GLYCOSIDASE ENZYMES

This application is a continuation-in-part of pending 
patent application 08/583,787 filed January 11, 1996.

This invention relates to newly identified 
polynucleotides, polypeptides encoded by such 
polynucleotides, the use of such polynucleotides and 
polypeptides, as well as the production and isolation of such 
polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have
been putatively identified as glucosidases, α-galactosidases,
β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

The glycosidic bond of β-galactosides can be cleaved by
different classes of enzymes: (i) phospho-β-galactosidases
(EC 3.2.1.85) are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system
(PTS)-dependent uptake; (ii) typical β-galactosidases (EC
3.2.1.23), represented by the Escherichia coli LacZ enzyme,
which are relatively specific for β-galactosides; and (iii)
β-glucosidases (EC 3.2.1.21) such as the enzymes of
Agrobacterium faecalis, Clostridium thermocellum, Pyrococcus
furiosus or Sulfolobus solfataricus (Day, A.G. and Withers,
S.G. (1986) Purification and characterization of a
β-glucosidase from Alcaligenes faecalis. Can. J. Biochem. Cell.
Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J.
Biochem. 213, 305-312; Ait, N., Creuzet, N. and Cattaneo, J.
(1982) Properties of β-glucosidase purified from Clostridium
thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W.
(1991) Evidence that β-galactosidase of Sulfolobus
solfataricus is only one of several activities of a
thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57,
1644-1649). Members of the latter group, although highly
specific with respect to the β-anomeric configuration of the
glycosidic linkage, often display a rather relaxed substrate
specificity and hydrolyse β-glucosides as well as β-fucosides
and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze
the hydrolysis of galactose groups on a polysaccharide
backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the
hydrolysis of mannose groups internally on a polysaccharide
backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases
hydrolyze non-reducing, terminal mannose residues on a
mannose-containing polysaccharide and catalyze the cleavage
of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide
composed of a β-1,4 linked mannose backbone with α-1,6 linked
galactose sidechains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and
α-galactosidase. β-Mannanase hydrolyses the mannose backbone
internally and β-mannosidase hydrolyses non-reducing,
terminal mannose residues. α-Galactosidase hydrolyses
α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that 
degrade them have a variety of applications. Guar is 
commonly used as a thickening agent in food and is utilized 
in hydraulic fracturing in oil and gas recovery. 
Consequently, galactomannanases are industrially relevant for 
the degradation and modification of guar. Furthermore, a
need exists for thermostable galactomannanases that are active
in extreme conditions associated with drilling and well 
stimulation. 

There are other applications for these enzymes in 
various industries, such as in the beet sugar industry. 20- 
30% of the domestic U.S. sucrose consumption is sucrose from 
sugar beets. Raw beet sugar can contain a small amount of 
raffinose when the sugar beets are stored before processing
and rotting begins to set in. Raffinose inhibits the 
crystallization of sucrose and also constitutes a hidden 
quantity of sucrose. Thus, there is merit to eliminating 
raffinose from raw beet sugar. α-Galactosidase has also been
used as a digestive aid to break down raffinose, stachyose, 
and verbascose in such foods as beans and other gassy foods. 

β-Galactosidases which are active and stable at high
temperatures appear to be superior enzymes for the production
of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160,
Cambridge University Press, Cambridge, UK). Also, several
studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides
via transglycosylation reactions (Nilsson, K.G.I. (1988)
Enzymatic synthesis of oligosaccharides. Trends Biotechnol.
6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide
synthesis by enzymatic transglycosylation. Glycoconjugate J.
7, 145-162). Despite the commercial potential, only a few
β-galactosidases of thermophiles have been characterized so
far. Two genes reported encode β-galactoside-cleaving enzymes
of the hyperthermophilic bacterium Thermotoga maritima, one
of the most thermophilic organotrophic eubacteria described
to date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M.,
Woese, C.R., Sleytr, U.B. and Stetter, K.O. (1986) T. maritima
sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C, Arch. Microbiol.
144, 324-333). The gene products have been
identified as a β-galactosidase and a β-glucosidase.

WO 97/25417

PCT/US97/00092

Pullulanase is well known as a debranching enzyme of
pullulan and starch. The enzyme hydrolyzes α-1,6-glucosidic
linkages on these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very
important industrial application of this enzyme. The
degradation of starch is carried out in two stages. The first
stage involves the liquefaction of the substrate with
α-amylase, and the second stage, or saccharification stage, is
performed by β-amylase with pullulanase added as a
debranching enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial 
applications. For instance, the endoglucanases of the 
present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of
plant biomass into fuels and chemicals. Endoglucanases also
have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the
fruit juice and brewing industry for the clarification and
extraction of juices.

The polynucleotides and polypeptides of the present 
invention have been identified as glucosidases,
α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases as a result of
their enzymatic activity. 

In accordance with one aspect of the present invention, 
there are provided novel enzymes, as well as active 
fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present 
invention, there are provided isolated nucleic acid molecules 
encoding the enzymes of the present invention including 
mRNAs, cDNAs, genomic DNAs as well as active analogs and 
fragments of such enzymes.

In accordance with another aspect of the present
invention there are provided isolated nucleic acid molecules
encoding mature polypeptides expressed by the DNA contained
in ATCC Deposit No. 97379.

In accordance with yet a further aspect of the present 
invention, there is provided a process for producing such
polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells
containing a nucleic acid sequence of the present invention
under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present 
invention, there is provided a process for utilizing such
enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the
food processing industry, in the pharmaceutical industry, for
example, to treat intolerance to lactose, as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice
industry, in baking, in the textile industry and in the
detergent industry. 

In accordance with yet a further aspect of the present 
invention, there is provided a process for utilizing such
enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose
residues. Further, polysaccharides such as galactomannan and
the enzymes according to the invention that degrade them have
a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic
fracturing in oil and gas recovery. Consequently, mannanases
are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for
thermostable mannanases that are active in extreme conditions
associated with drilling and well stimulation. 

In accordance with yet a further aspect of the present 
invention, there are also provided nucleic acid probes
comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the
present invention.

In accordance with yet a further aspect of the present 
invention, there is provided a process for utilizing such 
enzymes, or polynucleotides encoding such enzymes, for in 
vitro purposes related to scientific research, for example, 
to generate probes for identifying similar sequences which 
might encode similar enzymes from other organisms by using 
certain regions, i.e., conserved sequence regions, of the
nucleotide sequence.

These and other aspects of the present invention should 
be apparent to those skilled in the art from the teachings 
herein.

Brief Description of the Drawings
The following drawings are illustrative of embodiments
of the invention and are not meant to limit the scope of the
invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of M11TL of the
present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present
invention (Applied Biosystems, Inc.).

Figure 2 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of F1-12G.

Figure 4 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 9N2-31B/G.

Figure 5 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of MSB8-6G.

Figure 6 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of AEDII12RA-18B/G.

Figure 7 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of GC74-22G.




Figure 8 is an illustration of the full-length DNA and 
corresponding deduced amino acid sequence of VC1-7G1.

Figure 9 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 37GP1.

Figure 10 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GC2.

Figure 11 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP2.

Figure 12 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 63GB1.

Figure 13 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of OC1/4V.

Figure 14 is an illustration of the full-length DNA and
corresponding deduced amino acid sequence of 6GP3.

Definitions 

The term "gene" means the segment of DNA involved in 
producing a polypeptide chain; it includes regions preceding 
and following the coding region (leader and trailer) as well 
as intervening sequences (introns) between individual coding 
segments (exons).

A coding sequence is "operably linked to" another coding 
sequence when RNA polymerase will transcribe the two coding 
sequences into a single mRNA, which is then translated into 
a single polypeptide having amino acids derived from both 
coding sequences. The coding sequences need not be
contiguous to one another so long as the expressed sequences 
ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by 
recombinant DNA techniques; i.e., produced from cells 
transformed by an exogenous DNA construct encoding the 
desired enzyme. "Synthetic" enzymes are those prepared by 
chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence
encoding" a particular enzyme is a DNA sequence which is
transcribed and translated into an enzyme when placed under
the control of appropriate regulatory sequences.

Summary of the Invention 
In accordance with an aspect of the present invention, 
there are provided isolated nucleic acids (polynucleotides) 
which encode for the mature enzymes having the deduced amino 
acid sequences of Figures 1-14 (SEQ ID NOS: 15-28).

In accordance with another aspect of the present 
invention, there are provided isolated polynucleotides 
encoding the enzymes of the present invention. The deposited 
material is a mixture of genomic clones comprising DNA 
encoding an enzyme of the present invention. Each genomic 
clone comprising the respective DNA has been inserted into a 
pBluescript vector (Stratagene, La Jolla, CA). The deposit
has been deposited with the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852, USA, on
December 13, 1995 and assigned ATCC Deposit No. 97379.

The deposit(s) have been made under the terms of the
Budapest Treaty on the International Recognition of the
Deposit of Micro-organisms for Purposes of Patent Procedure.
The strains will be irrevocably and without restriction or
condition released to the public upon the issuance of a
patent. These deposits are provided merely as a convenience to
those of skill in the art and are not an admission that a
deposit is required under 35 U.S.C. §112. The sequences of
the polynucleotides contained in the deposited materials, as 
well as the amino acid sequences of the polypeptides encoded 
thereby, are controlling in the event of any conflict with 
any description of sequences herein. A license may be 
required to make, use or sell the deposited materials, and no 
such license is hereby granted. 

Detailed Description of the Invention
The polynucleotides of this invention were originally
recovered from genomic gene libraries derived from the
following organisms:

M11TL is a new species of Desulfurococcus isolated from
Diamond Pool in Yellowstone National Park. The organism
grows optimally at 85-88°C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates
with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was
isolated from Yellowstone National Park. It grows optimally
at 75°C in a low salt medium with cellulose as a substrate
and N2 in gas phase.

Pyrococcus furiosus VC1 is from the genus Pyrococcus.
VC1 was isolated from Vulcano, Italy. It grows optimally at
100°C in a high salt medium (marine) containing elemental
sulfur, yeast extract, peptone and starch as substrates and
N2 in gas phase.

Staphylothermus marinus F1 is from the genus
Staphylothermus. F1 was isolated from Vulcano, Italy. It
grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates
and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2
was isolated from diffuse vent fluid in the East Pacific
Rise. It is a strict anaerobe that grows optimally at 87°C.

Thermotoga maritima MSB8 is from the genus Thermotoga,
and was isolated from Vulcano, Italy. MSB8 grows optimally
at 85°C, pH 6.5 in a high salt medium (marine) containing
starch and yeast extract as substrates and N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus
Thermococcus. AEDII12RA grows optimally at 85°C, pH 9.5 in
a high salt medium (marine) containing polysulfides and yeast
extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus
Thermococcus. GC74 grows optimally at 85°C, pH 6.0 in a high
salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas
phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine
medium under anaerobic conditions. It has many substrates.
[Add descriptions of new organisms] 

Accordingly, the polynucleotides and enzymes encoded 
thereby are identified by the organism from which they were 
isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G"
(Figure 2 and SEQ ID NOS:2 and 16), "F1-12GT" (Figure 3 and
SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4 and SEQ ID NOS:4
and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20),
"GC74-22G" (Figure 7 and SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8
and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and SEQ ID NOS:9
and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24),
"6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a"
(Figure 12 and SEQ ID NOS:12 and 26), "OC1/4V" (Figure 13 and
SEQ ID NOS:13 and 27), and "6GP3" (Figure 14 and SEQ ID
NOS: 28).

The polynucleotides and polypeptides of the present 
invention show identity at the nucleotide and protein level 
to known genes and proteins encoded thereby as shown in Table 
1. 

Table 1

Clone                                       Gene/Protein with Closest Homology                        Identity

M11TL-29G                                   Sulfolobus solfataricus DSM 1616/P1, β-galactosidase      51%     55%
OC1/4V-33B/G                                Caldocellum saccharolyticum, β-glucosidase                52%     57%
Staphylothermus marinus F1-12G              Bacillus polymyxa, β-galactosidase                        36%     48%
Thermococcus 9N2-31B/G                      Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   51%     50%
Thermotoga maritima MSB8-6G                 Clostridium thermocellum bglB                             45%     53%
Thermococcus AEDII12RA-18B/G                Bacillus polymyxa, β-galactosidase                        34%     48%
Thermococcus chitonophagus GC74-22G         Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   46%     54%
Pyrococcus furiosus VC1-7G1                 Sulfolobus solfataricus/MT-4 β-galactosidase              46.4%   52.5%
Thermotoga maritima α-galactosidase (6GC2)  Pediococcus pentosaceus α-galactosidase                   49%     29%
Thermotoga maritima β-mannanase (6GP2)      Aspergillus aculeatus                                     56%     37%
AEPII 1a β-mannosidase (63GB1)              Sulfolobus solfataricus β-galactosidase                   78%     56%
OC1/4V endoglucanase (33GP1)                Clostridium thermocellum endo-1,4-β-endoglucanase         65%     43%
Thermotoga maritima pullulanase             Caldocellum saccharolyticum α-dextrin 6-glucanohydrolase  72%     53%
Bankia gouldi mix endoglucanase (37GP1)     None available


The polynucleotides and enzymes of the present invention 
show homology to each other as shown in Table 2.

Table 2

Clone                         Gene/Protein with Closest Homology                            Protein Identity  Nucleic Acid Identity

Staphylothermus               Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase    55%               57%
Thermococcus 9N2-31B/G        Thermococcus chitonophagus GC74-22G glucosidase               74%               66%
Pyrococcus furiosus VC1-7G1   Pyrococcus furiosus VC1-7B/G β-galactosidase                  46.4%             54%

All the clones identified in Tables 1 and 2 encode 
polypeptides which have α-glycosidase or β-glycosidase
activity. 

This invention, in addition to the isolated nucleic acid
molecules encoding the enzymes of the present invention, also
provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are
capable of hybridizing under conditions hereinafter
described, to the polynucleotides of SEQ ID NOS:1-8; or (ii)
they encode DNA sequences which are degenerate to the
polynucleotides of SEQ ID NOS:1-8. Degenerate DNA sequences
encode the amino acid sequences of SEQ ID NOS:9-16, but have
variations in the nucleotide coding sequences. As used
herein, substantially similar refers to the sequences having
similar identity to the sequences of the instant invention.
The nucleotide sequences that are substantially the same can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating the nucleic acid molecules
encoding the enzymes of the present invention is to probe a
gene library with a natural or artificially designed probe
using art recognized procedures (see, for example, Current
Protocols in Molecular Biology, Ausubel F.M. et al. (Eds.)
Green Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in
the art that the polynucleotides of SEQ ID NOS:1-14 or
fragments thereof (comprising at least 12 contiguous
nucleotides) are particularly useful probes. Other
particularly useful probes for this purpose are fragments
hybridizable to the sequences of SEQ ID NOS:1-14 (i.e.,
comprising at least 12 contiguous nucleotides).

With respect to nucleic acid sequences which hybridize 
to specific nucleic acid sequences disclosed herein, 
hybridization may be carried out under conditions of reduced 
stringency, medium stringency or even stringent conditions. 
As an example of oligonucleotide hybridization, a polymer
membrane containing immobilized denatured nucleic acids is
first prehybridized for 30 minutes at 45°C in a solution
consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM
Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/mL
polyriboadenylic acid. Approximately 2 x 10^7 cpm (specific
activity 4-9 x 10^8 cpm/µg) of 32P end-labeled oligonucleotide
probe are then added to the solution. After 12-16 hours of
incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride,
pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30
minute wash in fresh 1X SET at Tm-10°C for the
oligonucleotide probe. The membrane is then exposed to
autoradiographic film for detection of hybridization signals.
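
The final wash in the example above is run 10°C below the melting temperature (Tm) of the oligonucleotide probe. The text does not say how Tm is estimated. The following is a minimal sketch that assumes the common Wallace rule for short oligonucleotides (2°C per A or T, 4°C per G or C); the probe sequence and the function names are hypothetical and serve only as illustration.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in deg C with the Wallace rule (an assumed method,
    not one stated in the text): 2*(A+T) + 4*(G+C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


def final_wash_temperature(oligo: str, offset: int = 10) -> int:
    """Temperature of the second wash, taken as Tm minus 10 deg C."""
    return wallace_tm(oligo) - offset


# Hypothetical 18-mer probe with 9 A/T and 9 G/C bases.
probe = "ATGGCTACCGTTAAGCTG"
print(wallace_tm(probe), final_wash_temperature(probe))
```

The Wallace rule is a rough approximation for probes of roughly 14 to 20 bases; longer probes or different salt conditions would call for a different estimate.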

Stringent conditions means hybridization will occur only
if there is at least 90% identity, preferably at least 95%
identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning,
A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory
(1989) which is hereby incorporated by reference in its
entirety. Also, it is understood that a fragment of a 100 bp
sequence that is 95 bp in length has 95% identity with the
100 bp sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least
70% and preferably at least 80% identical to another DNA
(RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases
of the first sequence and the bases of the other sequence,
when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a 
polynucleotide sequence which comprises a percentage of the 
same bases as a reference polynucleotide (SEQ ID NOS:1-8).
For example, a polynucleotide which is at least 90% identical 
to a reference polynucleotide, has polynucleotide bases which 
are identical in 90% of the bases which make up the reference 
polynucleotide and may have different bases in 10% of the 
bases which comprise that polynucleotide sequence. 
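
The definition of identity can be illustrated with a short calculation. The sketch below assumes the two sequences have already been aligned (for example by BLASTN) into strings of equal length with "-" marking gaps; the function name and the example sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percentage of reference bases matched by the candidate at the
    same aligned position. Inputs must be pre-aligned, equal-length
    strings in which '-' marks a gap."""
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(r, c) for r, c in zip(reference.upper(), candidate.upper())
             if r != "-"]  # count only positions holding a reference base
    if not pairs:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, c in pairs if r == c)
    return 100.0 * matches / len(pairs)


# One mismatch in ten bases gives 90% identity.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))

# A 95-base fragment of a 100-base sequence has 95% identity with it.
full = "ACGT" * 25
fragment = full[:95] + "-" * 5
print(percent_identity(full, fragment))
```

Counting against the bases of the reference follows the wording of the definition above; other conventions divide by the alignment length instead.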

The present invention relates to polynucleotides which
differ from the reference polynucleotide such that the
changes are silent changes, for example the changes do not
alter the amino acid sequence encoded by the polynucleotide.
The present invention also relates to nucleotide changes
which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded
by the reference polynucleotide. In a preferred aspect of
the invention these polypeptides retain the same biological
action as the polypeptide encoded by the reference
polynucleotide.
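
A silent change, and more generally a degenerate coding sequence, can be checked by translating both sequences and comparing the results. The following is a minimal sketch using the standard genetic code; the short coding sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent(reference: str, variant: str) -> bool:
    """True if the variant encodes the same amino acid sequence."""
    return translate(reference) == translate(variant)


reference = "ATGGCTAAA"            # Met-Ala-Lys (hypothetical)
print(is_silent(reference, "ATGGCCAAG"))  # GCT->GCC and AAA->AAG are synonymous
print(is_silent(reference, "ATGGATAAA"))  # GCT->GAT changes Ala to Asp
```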

It is also appreciated that such probes can be and are
preferably labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent
dyes or enzymes capable of catalyzing the formation of a
detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen
such sources for related sequences.

The polynucleotides of this invention were recovered 
from genomic gene libraries from the organisms listed in 
Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems).
Mass excisions can be performed on these libraries to
generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the
protocols/methods hereinafter described.

The excision libraries are introduced into the E. coli
strain BW14893 F'kanlA. Expression clones are then
identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other
glycosidases are identified and repurified. The
polynucleotides, and enzymes encoded thereby, of the present
invention, yield the activities as described above.

The coding sequences for the enzymes of the present 
invention were identified by screening the genomic DNAs 
prepared for the clones having glucosidase or galactosidase 
activity.

An example of such an assay is a high temperature filter
assay wherein expression clones were identified by use of
high temperature filter assays using Z-buffer (see recipe
below) containing 1 mg/ml of the substrate
5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic
Chemicals Limited or Sigma) after introducing an excision
library into the E. coli strain BW14893 F'kanlA. Expression
clones encoding XGLUases were identified and repurified from
M11TL, OC1/4V, Pyrococcus furiosus VC1, Staphylothermus
marinus F1, Thermococcus 9N-2, Thermotoga maritima MSB8,
Thermococcus alcaliphilus AEDII12RA, and Thermococcus
chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short
Course in Bacterial Genetics, p. 445.)

per liter:

Na2HPO4-7H2O          16.1 g
NaH2PO4-7H2O          5.5 g
KCl                   0.75 g
MgSO4-7H2O            0.246 g
β-mercaptoethanol     2.7 ml

Adjust pH to 7.0
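
Because the recipe is given per liter, smaller batches are obtained by scaling each component linearly. The sketch below illustrates this; the dictionary keys and the function name are hypothetical labels, and the amounts are copied from the recipe above.

```python
# Per-liter amounts copied from the Z-buffer recipe above.
Z_BUFFER_PER_LITER = {
    "Na2HPO4-7H2O (g)": 16.1,
    "NaH2PO4-7H2O (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(volume_ml: float) -> dict:
    """Scale the per-liter recipe linearly to the requested volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 4)
            for name, amount in Z_BUFFER_PER_LITER.items()}


# Amounts for a 250 ml batch; pH is still adjusted to 7.0 afterwards.
for name, amount in scale_recipe(250).items():
    print(f"{name}: {amount}")
```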

High Temperature Filter Assay
(1) The F factor F'kan (from E. coli strain CSH118) (1) was
    introduced into the pho- phn- lac- strain BW14893 (2).
    The filamentous phage library was plated on
    the resulting strain, BW14893 F'kan. (Miller, J.H.
    (1992) A Short Course in Bacterial Genetics; Lee, K.S.,
    Metcalf, et al., (1992) Evidence for two phosphonate
    degradative pathways in Enterobacter aerogenes. J.
    Bacteriol., 174:2501-2510.)
(2) After growth on 100 mm LB plates containing 100 µg/ml
    ampicillin, 80 µg/ml methicillin and 1 mM IPTG, colony
    lifts were performed using Millipore HATF membrane
    filters.
(3) The colonies transferred to the filters were lysed with
    chloroform vapor in 150 mm glass petri dishes.
(4) The filters were transferred to 100 mm glass petri
    dishes containing a piece of Whatman 3MM filter paper
    saturated with buffer.
    (a) When testing for galactosidase activity
        (XGALase), 3MM paper was saturated with Z-buffer
        containing 1 mg/ml XGAL (ChemBridge Corporation).
        After transferring filter bearing lysed colonies to
        the glass petri dish, placed dish in oven at
        80-85°C.
    (b) When testing for glucosidase (XGLUase), 3MM
        paper was saturated with Z-buffer containing 1
        mg/ml XGLU. After transferring filter bearing
        lysed colonies to the glass petri dish, placed dish
        in oven at 80-85°C.
(5) 'Positives' were observed as blue spots on the filter
    membranes. Used the following filter rescue technique
    to retrieve plasmid from lysed positive colony. Used
    pasteur pipette (or glass capillary tube) to core blue
    spots on the filter membrane. Placed the small filter
    disk in an Eppendorf tube containing 20 µl water.
    Incubated the Eppendorf tube at 75°C for 5 minutes
    followed by vortexing to elute plasmid DNA off filter.
    This DNA was transformed into electrocompetent E. coli
    cells DH10B for Thermotoga maritima MSB8-6G,
    Staphylothermus marinus F1-12G, Thermococcus
    AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G, M11TL
    and OC1/4V. Electrocompetent BW14893 F'kanlA E. coli were
    used for Thermococcus 9N2-31B/G and Pyrococcus furiosus
    VC1-7G1. Repeated filter-lift assay on transformation
    plates to identify 'positives'. Return transformation
    plates to 37°C incubator after filter lift to regenerate
    colonies. Inoculate 3 ml LB liquid containing 100 µg/ml
    ampicillin with repurified positives and incubate at
    37°C overnight. Isolate plasmid DNA from these cultures
    and sequence plasmid insert. In some instances where
    the plates used for the initial colony lifts contained
    non-confluent colonies, a specific colony corresponding
    to a blue spot on the filter could be identified on a
    regenerated plate and repurified directly, instead of
    using the filter rescue technique.

Another example of such an assay is a variation of the
high temperature filter assay wherein colony-laden filters
are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3MM paper is
saturated with different buffers (e.g., 100 mM NaCl, 5 mM
MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme activity
under different buffer conditions.

A β-glucosidase assay may also be employed, wherein
GlcpβNp is used as an artificial substrate
(aryl-β-glucosidase). The increase in absorbance at 405 nm as a
result of p-nitrophenol (pNp) liberation was followed on a
Hitachi U-1100 spectrophotometer equipped with a
thermostatted cuvette holder. The assays may be performed at
80°C or 90°C in closed 1-ml quartz cuvettes. A standard
reaction mixture contains 150 mM trisodium substrate, pH 5.0
(at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹·cm⁻¹).
The reaction mixture is allowed to reach the desired
temperature, after which the reaction is started by injecting
an appropriate amount of enzyme (1.06 ml final volume).

1 U β-glucosidase activity is defined as that amount
required to catalyze the formation of 1.0 µmol pNp/min.
D-cellobiose may also be used as a substrate.
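
By way of illustration only, the unit definition can be related
to the measured absorbance slope through the Beer-Lambert law.
The sketch below is not part of the described procedure; it
assumes a 1 cm optical path length (the path length is not
stated above) and uses the extinction coefficient and final
volume given for the standard reaction mixture. The function
name and parameter names are hypothetical.

```python
def beta_glucosidase_units(delta_a405_per_min,
                           epsilon_per_mm_cm=0.561,  # pNp, mM^-1 cm^-1 (from the text)
                           path_cm=1.0,              # ASSUMPTION: 1 cm cuvette path
                           volume_ml=1.06):          # final reaction volume (from the text)
    """Return units (umol pNp formed per minute) present in the cuvette."""
    if epsilon_per_mm_cm <= 0 or path_cm <= 0 or volume_ml <= 0:
        raise ValueError("epsilon, path length and volume must be positive")
    # Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l), in mM/min
    rate_mm_per_min = delta_a405_per_min / (epsilon_per_mm_cm * path_cm)
    # mM multiplied by ml gives umol, so this is umol/min, i.e. units
    return rate_mm_per_min * volume_ml


if __name__ == "__main__":
    # Hypothetical slope of 0.2 absorbance units per minute at 405 nm
    print(round(beta_glucosidase_units(0.2), 3))
```

Dividing the result by the amount of protein injected would give
a specific activity; that step is omitted here because no
protein amounts are stated.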

An ONPG assay for β-galactosidase activity is described
by Miller, J.H. (1992) A Short Course in Bacterial Genetics
and Miller, J.H. (1972) Experiments in Molecular Genetics, the
contents of which are hereby incorporated by reference in
their entirety.

A quantitative fluorometric assay for β-galactosidase
specific activity is described by: Youngman P., (1987)
Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria,
in Plasmids: A Practical Approach (ed. K. Hardy) pp. 79-103,
IRL Press, Oxford. A description of the procedure can be
found in Miller (1992) p. 75-77, the contents of which are 
incorporated by reference herein in their entirety. 

The polynucleotides of the present invention may be in
the form of DNA, which DNA includes cDNA, genomic DNA, and
synthetic DNA. The DNA may be double-stranded or
single-stranded, and if single-stranded may be the coding strand or
non-coding (anti-sense) strand. The coding sequence which
encodes the mature enzymes may be identical to the coding
sequences shown in Figures 1-8 (SEQ ID NOS:1-8) or may be a
different coding sequence which coding sequence, as a result
of the redundancy or degeneracy of the genetic code, encodes
the same mature enzymes as the DNA of Figures 1-14 (SEQ ID
NOS:1-14).
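
As a minimal sketch of what degeneracy of the genetic code
means in practice, the following Python fragment translates two
different coding sequences with the standard genetic code and
shows that they encode the same peptide. The two short
sequences are hypothetical examples chosen for illustration and
are not taken from the Figures.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA sequence, reading frame 1, ignoring a partial last codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    seq_a = "ATGGTGAATGCT"  # hypothetical coding sequence
    seq_b = "ATGGTCAACGCA"  # differs at three third-codon positions
    print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```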

The polynucleotide which encodes for the mature enzyme 
of Figures 1-14 (SEQ ID NOS: 15-28) may include, but is not 
limited to: only the coding sequence for the mature enzyme;
the coding sequence for the mature enzyme and additional 
coding sequence such as a leader sequence or a proprotein 
sequence; the coding sequence for the mature enzyme (and 
optionally additional coding sequence) and non-coding 
sequence, such as introns or non-coding sequence 5' and/or 3'
of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme
(protein)" encompasses a polynucleotide which includes only
coding sequence for the enzyme as well as a polynucleotide
which includes additional coding and/or non-coding sequence.

The present invention further relates to variants of the 
hereinabove described polynucleotides which encode for 
fragments, analogs and derivatives of the enzymes having the 
deduced amino acid sequences of Figures 1-14 (SEQ ID NOS:15-28).
The variant of the polynucleotide may be a naturally
occurring allelic variant of the polynucleotide or a
non-naturally occurring variant of the polynucleotide.

Thus, the present invention includes polynucleotides 
encoding the same mature enzymes as shown in Figures 1-14 
(SEQ ID NOS:15-28) as well as variants of such 
polynucleotides which variants encode for a fragment, 
derivative or analog of the enzymes of Figures 1-14 (SEQ ID 
NOS:15-28). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.

As hereinabove indicated, the polynucleotides may have 
a coding sequence which is a naturally occurring allelic 
variant of the coding sequences shown in Figures 1-14 (SEQ ID
NOS:1-14). As known in the art, an allelic variant is an
alternate form of a polynucleotide sequence which may have a 
substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function 
of the encoded enzyme. 

Fragments of the full length gene of the present 
invention may be used as a hybridization probe for a cDNA or 
a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to 
the gene or similar biological activity. Probes of this type 
preferably have at least 10, preferably at least 15, and even
more preferably at least 30 bases and may contain, for 
example, at least 50 or more bases. The probe may also be 
used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the 
complete gene including regulatory and promoter regions, 
exons, and introns. An example of a screen comprises 
isolating the coding region of the gene by using the known 
DNA sequence to synthesize an oligonucleotide probe. Labeled 
oligonucleotides having a sequence complementary to that of 
the gene of the present invention are used to screen a 
library of genomic DNA to determine which members of the 
library the probe hybridizes to. 
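
The logic of such a probe screen can be sketched in silico. The
fragment below is an illustration only: it treats hybridization
as an exact match between the probe and either strand of a
library member, which is a deliberate simplification of real
hybridization behavior. The function names, the probe and the
library entries are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(probe, library):
    """Return names of library members to which the probe could anneal.

    `library` maps a clone name to the sequence of one strand. The probe
    anneals to that strand where the strand contains the probe's reverse
    complement, and to the opposite strand where it contains the probe itself.
    """
    probe = probe.upper()
    target = reverse_complement(probe)
    hits = []
    for name, seq in library.items():
        seq = seq.upper()
        if target in seq or probe in seq:
            hits.append(name)
    return hits


if __name__ == "__main__":
    probe = "ACGTTGCAAGGCTTA"  # hypothetical 15-base probe
    library = {
        "clone_1": "GG" + reverse_complement(probe) + "CC",
        "clone_2": "AAAAAAAAAAAAAAAAAAAA",
        "clone_3": "TT" + probe + "TT",
    }
    print(probe_hits(probe, library))
```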

The present invention further relates to
polynucleotides which hybridize to the hereinabove-described
sequences if there is at least 70%, preferably at least 90%,
and more preferably at least 95% identity between the
sequences. The present invention particularly relates to
polynucleotides which hybridize under stringent conditions to
the hereinabove-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will
occur only if there is at least 95% and preferably at least
97% identity between the sequences. The polynucleotides
which hybridize to the hereinabove-described polynucleotides
in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the
mature enzyme encoded by the DNA of Figures 1-14 (SEQ ID
NOS:1-14).
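
For illustration, the identity thresholds recited above (70%,
90%, 95% and 97%) can be applied to a pair of pre-aligned
sequences as sketched below. The text does not define how
percent identity is computed; this sketch assumes it is the
number of matching columns divided by the alignment length,
with gap columns counted as mismatches. The function names and
example sequences are hypothetical.

```python
TIERS = (97, 95, 90, 70)  # percent identity thresholds recited in the text


def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    a, b = a.upper(), b.upper()
    # ASSUMPTION: any column containing a gap counts as a mismatch
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def identity_tier(pid):
    """Return the highest recited threshold that the identity value meets."""
    for tier in TIERS:
        if pid >= tier:
            return f"at least {tier}%"
    return "below 70%"


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # hypothetical pair
    print(pid, identity_tier(pid))
```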

Alternatively, the polynucleotide may have at least 15 
bases, preferably at least 30 bases, and more preferably at 
least 50 bases which hybridize to any part of a 
polynucleotide of the present invention and which has an 
identity thereto, as hereinabove described, and which may or 
may not retain activity. For example, such polynucleotides 
may be employed as probes for the polynucleotides of SEQ ID 
NOS: 1-14, for example, for recovery of the polynucleotide or 
as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to
polynucleotides having at least a 70% identity, preferably at 
least 90% identity and more preferably at least a 95% 
identity to a polynucleotide which encodes the enzymes of SEQ 
ID NOS: 15-28 as well as fragments thereof, which fragments 
have at least 15 bases, preferably at least 30 bases and most 
preferably at least 50 bases, which fragments are at least 
90% identical, preferably at least 95% identical and most 
preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which 
have the deduced amino acid sequences of Figures 1-14 (SEQ ID 
NOS: 15-28) as well as fragments, analogs and derivatives of 
such enzymes.

The terms "fragment," "derivative" and "analog" when
referring to the enzymes of Figures 1-14 (SEQ ID NOS:15-28)
mean enzymes which retain essentially the same biological
function or activity as such enzymes. Thus, an analog
includes a proprotein which can be activated by cleavage of
the proprotein portion to produce an active mature enzyme.

The enzymes of the present invention may be a 
recombinant enzyme, a natural enzyme or a synthetic enzyme, 
preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of 
Figures 1-14 (SEQ ID NOS: 15-28) may be (i) one in which one 
or more of the amino acid residues are substituted with a 
conserved or non-conserved amino acid residue (preferably a 
conserved amino acid residue) and such substituted amino acid 
residue may or may not be one encoded by the genetic code, or 
(ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the 
mature enzyme is fused with another compound, such as a 
compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the 
additional amino acids are fused to the mature enzyme, such 
as a leader or secretory sequence or a sequence which is 
employed for purification of the mature enzyme or a 
proprotein sequence. Such fragments, derivatives and analogs 
are deemed to be within the scope of those skilled in the art 
from the teachings herein. 

The enzymes and polynucleotides of the present invention
are preferably provided in an isolated form, and preferably 
are purified to homogeneity. 

The term "isolated" means that the material is removed
from its original environment (e.g., the natural environment
if it is naturally occurring). For example, a
naturally-occurring polynucleotide or enzyme present in a living animal
is not isolated, but the same polynucleotide or enzyme, 
separated from some or all of the coexisting materials in the 
natural system, is isolated. Such polynucleotides could be 
part of a vector and/or such polynucleotides or enzymes could 
be part of a composition, and still be isolated in that such 
vector or composition is not part of its natural environment.

The enzymes of the present invention include the enzymes
of SEQ ID NOS: 15-28 (in particular the mature enzyme) as well 
as enzymes which have at least 70% similarity (preferably at 
least 70% identity) to the enzymes of SEQ ID NOS: 9-16 and 
more preferably at least 90% similarity (more preferably at 
least 90% identity) to the enzymes of SEQ ID NOS: 15-28 and 
still more preferably at least 95% similarity (still more 
preferably at least 95% identity) to the enzymes of SEQ ID 
NOS: 9-16 and also include portions of such enzymes with such 
portion of the enzyme generally containing at least 30 amino
acids and more preferably at least 50 amino acids. 

As known in the art "similarity" between two enzymes is 
determined by comparing the amino acid sequence and its 
conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative"
polypeptide, and reference polypeptide may differ in amino
acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in 
any combination. 

Among preferred variants are those that vary from a
reference by conservative amino acid substitutions. Such
substitutions are those that substitute a given amino acid in
a polypeptide by another amino acid of like characteristics.
Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl
residues Ser and Thr; exchange of the acidic residues Asp and
Glu; substitution between the amide residues Asn and Gln;
exchange of the basic residues Lys and Arg; and replacements
among the aromatic residues Phe and Tyr.
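
The conservative substitution groups listed above lend
themselves to a simple similarity calculation. The sketch below
is illustrative only: it assumes two ungapped sequences of equal
length in one-letter code and counts a position as similar when
the residues are identical or fall in the same listed group. The
text does not prescribe this particular scoring, and the
function names and example peptides are hypothetical.

```python
# Groups taken from the text, in one-letter amino acid code
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(x, y):
    """True if x and y are different residues belonging to the same listed group."""
    return x != y and any(x in g and y in g for g in CONSERVATIVE_GROUPS)


def percent_similarity(a, b):
    """Percent of positions that are identical or conservatively substituted."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must have the same non-zero length")
    a, b = a.upper(), b.upper()
    similar = sum(1 for x, y in zip(a, b) if x == y or is_conservative(x, y))
    return 100.0 * similar / len(a)


if __name__ == "__main__":
    # Hypothetical peptides: A/V, S/T and D/E are conservative, K/W is not, F/F is identical
    print(percent_similarity("ASDKF", "VTEWF"))
```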

Most highly preferred are variants which retain the same 
biological function and activity as the reference polypeptide 
from which they vary.

Fragments or portions of the enzymes of the present
invention may be employed for producing the corresponding 
full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the 
full-length enzymes. Fragments or portions of the
polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present
invention.

The present invention also relates to vectors which 
include polynucleotides of the present invention, host cells 
which are genetically engineered with vectors of the 
invention and the production of enzymes of the invention by 
recombinant techniques. 

Host cells are genetically engineered (transduced or 
transformed or transfected) with the vectors of this 
invention which may be, for example, a cloning vector or an 
expression vector. The vector may be, for example, in the
form of a plasmid, a viral particle, a phage, etc. The
engineered host cells can be cultured in conventional
nutrient media modified as appropriate for activating
promoters, selecting transformants or amplifying the genes of
the present invention. The culture conditions, such as
temperature, pH and the like, are those previously used with
the host cell selected for expression, and will be apparent 
to the ordinarily skilled artisan. 

The polynucleotides of the present invention may be 
employed for producing enzymes by recombinant techniques. 
Thus, for example, the polynucleotide may be included in any 
one of a variety of expression vectors for expressing an 
enzyme. Such vectors include chromosomal, nonchromosomal and 
synthetic DNA sequences, e.g., derivatives of SV40; bacterial
plasmids; phage DNA; baculovirus; yeast plasmids; vectors
derived from combinations of plasmids and phage DNA, viral
DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long
as it is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the 
vector by a variety of procedures. In general, the DNA
sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such 
procedures and others are deemed to be within the scope of 
those skilled in the art. 

The DNA sequence in the expression vector is operatively 
linked to an appropriate expression control sequence(s)
(promoter) to direct mRNA synthesis. As representative
examples of such promoters, there may be mentioned: LTR or
SV40 promoter, the E. coli lac or trp, the phage lambda PL
promoter and other promoters known to control expression of 
genes in prokaryotic or eukaryotic cells or their viruses. 
The expression vector also contains a ribosome binding site 
for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for 
amplifying expression. 

In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic
trait for selection of transformed host cells such as
dihydrofolate reductase or neomycin resistance for eukaryotic
cell culture, or such as tetracycline or ampicillin
resistance in E. coli.

The vector containing the appropriate DNA sequence as 
hereinabove described, as well as an appropriate promoter or 
control sequence, may be employed to transform an appropriate 
host to permit the host to express the protein. 

As representative examples of appropriate hosts, there 
may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as yeast;
insect cells such as Drosophila S2 and Spodoptera Sf9; animal
cells such as CHO, COS or Bowes melanoma; adenoviruses; plant
cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the
teachings herein.

More particularly, the present invention also includes 
recombinant constructs comprising one or more of the 
sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into 
which a sequence of the invention has been inserted, in a
forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory 
sequences, including, for example, a promoter, operably 
linked to the sequence. Large numbers of suitable vectors 
and promoters are known to those of skill in the art, and are 
commercially available. The following vectors are provided
by way of example: Bacterial: pQE70, pQE60, pQE-9 (Qiagen),
pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A,
pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia); Eukaryotic: pSV2CAT, pOG44, pXT1, pSG
(Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia). However,
any other plasmid or vector may be used as long as they are 
replicable and viable in the host. 

Promoter regions can be selected from any desired gene 
using CAT (chloramphenicol transferase) vectors or other 
vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp.
Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus,
and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary
skill in the art.

In a further embodiment, the present invention relates 
to host cells containing the above-described constructs. The 
host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast 
cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the construct into the host
cell can be effected by calcium phosphate transfection,
DEAE-Dextran mediated transfection, or electroporation (Davis, L.,
Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)).

The constructs in host cells can be used in a 
conventional manner to produce the gene product encoded by 
the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional 
peptide synthesizers.

Mature proteins can be expressed in mammalian cells, 
yeast, bacteria, or other cells under the control of 
appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention.
Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook,
et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of
which is hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the 
present invention by higher eukaryotes is increased by 
inserting an enhancer sequence into the vector. Enhancers 
are cis-acting elements of DNA, usually about from 10 to 300 
bp that act on a promoter to increase its transcription. 
Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early
promoter enhancer, the polyoma enhancer on the late side of 
the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include 
origins of replication and selectable markers permitting 
transformation of the host cell, e.g., the ampicillin
resistance gene of E. coli and S. cerevisiae TRP1 gene, and
a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor,
acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate
phase with translation initiation and termination sequences,
and preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal
identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed
recombinant product.

Useful expression vectors for bacterial use are 
constructed by inserting a structural DNA sequence encoding
a desired protein together with suitable translation
initiation and termination signals in operable reading phase
with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of
replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli,
Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces and
Staphylococcus, although others may also be employed as a
matter of choice.

As a representative but nonlimiting example, useful
expression vectors for bacterial use can comprise a
selectable marker and bacterial origin of replication derived
from commercially available plasmids comprising genetic
elements of the well known cloning vector pBR322 (ATCC
37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1
(Promega Biotec, Madison, WI, USA). These pBR322 "backbone"
sections are combined with an appropriate promoter and the
structural sequence to be expressed.

Following transformation of a suitable host strain and
growth of the host strain to an appropriate cell density, the 
selected promoter is induced by appropriate means (e.g.,
temperature shift or chemical induction) and cells are 
cultured for an additional period. 

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell
lysing agents; such methods are well known to those skilled
in the art. 

Various mammalian cell culture systems can also be 
employed to express recombinant protein. Examples of 
mammalian expression systems include the COS-7 lines of
monkey kidney fibroblasts, described by Gluzman, Cell, 23:175
(1981), and other cell lines capable of expressing a
compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise
an origin of replication, a suitable promoter and enhancer,
and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites,
transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the
SV40 splice and polyadenylation sites may be used to provide
the required nontranscribed genetic elements.

The enzyme can be recovered and purified from 
recombinant cell cultures by methods including ammonium 
sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, 
affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, 
as necessary, in completing configuration of the mature
protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps.

The enzymes of the present invention may be a naturally 
purified product, or a product of chemical synthetic
procedures, or produced by recombinant techniques from a
prokaryotic or eukaryotic host (for example, by bacterial,
yeast, higher plant, insect and mammalian cells in culture).
Depending upon the host employed in a recombinant production
procedure, the enzymes of the present invention may be
glycosylated or may be non-glycosylated. Enzymes of the
invention may or may not also include an initial methionine
amino acid residue.

β-galactosidase hydrolyzes lactose to galactose and
glucose. Accordingly, the OC1/4V, 9N2-31B/G, AEDII12RA-18B/G
and F1-12G enzymes may be employed in the food processing
industry for the production of low lactose content milk and
for the production of galactose or glucose from lactose
contained in whey obtained in a large amount as a by-product
in the production of cheese. Generally, it is desired that
enzymes used in food processing, such as the aforementioned
β-galactosidases, be stable at elevated temperatures to help
prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical
industry. The enzymes are used to treat intolerance to
lactose. In this case, a thermostable enzyme is desired, as
well. Thermostable β-galactosidases also have uses in
diagnostic applications, where they are employed as reporter
molecules.

Glucosidases act on soluble cellooligosaccharides from
the non-reducing end to give glucose as the sole product.
Glucanases (endo- and exo-) act in the depolymerization of
cellulose, generating more non-reducing ends (endoglucanases,
for instance, act on internal linkages yielding
cellobiose, glucose and cellooligosaccharides as products).
β-glucosidases are used in applications where glucose is the
desired product. Accordingly, M11TL, F1-12G, GC74-22G and
MSB8-6G (and OC1/4V, VC1-7G1, 9N2-31B/G and AEDII12RA-18B/G)
may be employed in a wide variety of industrial applications, 
including in corn wet milling for the separation of starch 
and gluten, in the fruit industry for clarification and 
equipment maintenance, in baking for viscosity reduction, in 
the textile industry for the processing of blue jeans, and in 
the detergent industry as an additive. For these and other 
applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding 
to a sequence of the present invention can be obtained by 
direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a 
nonhuman. The antibody so obtained will then bind the
enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate
antibodies binding the whole native enzymes. Such antibodies
can then be used to isolate the enzyme from cells expressing
that enzyme.

For preparation of monoclonal antibodies, any technique 
which provides antibodies produced by continuous cell line 
cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), 
the trioma technique, the human B-cell hybridoma technique 
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV- 
hybridoma technique to produce human monoclonal antibodies 
(Cole, et al., 1985, in Monoclonal Antibodies and Cancer
Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain 
antibodies (U.S. Patent 4,946,778) can be adapted to produce 
single chain antibodies to immunogenic enzyme products of 
this invention. Also, transgenic mice may be used to express 
humanized antibodies to immunogenic enzyme products of this 
invention.

Antibodies generated against the enzyme of the present
invention may be used in screening for similar enzymes from
other organisms and samples. Such screening techniques are
known in the art; for example, one such screening assay is
described in "Methods for Measuring Cellulase Activities",
Methods in Enzymology, Vol. 160, pp. 87-116, which is hereby
incorporated by reference in its entirety.

The present invention will be further described with 
reference to the following examples; however, it is to be 
understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified,
are by weight.

In order to facilitate understanding of the following 
examples certain frequently occurring methods and/or terms
will be described.

"Plasmids" are designated by a lower case p preceded
and/or followed by capital letters and/or numbers. The
starting plasmids herein are either commercially available,
publicly available on an unrestricted basis, or can be
constructed from available plasmids in accord with published
procedures. In addition, equivalent plasmids to those 
described are known in the art and will be apparent to the 
ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the
DNA with a restriction enzyme that acts only at certain
sequences in the DNA. The various restriction enzymes used
herein are commercially available and their reaction
conditions, cofactors and other requirements were used as
would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA
fragment is used with about 2 units of enzyme in about 20 µl
of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of
DNA are digested with 20 to 250 units of enzyme in a larger
volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the
manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the
supplier's instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to isolate
the desired fragment.
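
The fragment sizes expected on the gel after such a digestion
can be predicted from the sequence. The following sketch is an
illustration under stated assumptions: the DNA is treated as
linear, only the top strand is scanned, and fragment lengths are
measured from top-strand cut positions. The recognition
sequences and cut offsets are the commonly documented ones for
the enzymes named in Example 1; the function names and the test
sequence are hypothetical.

```python
# Recognition site and top-strand cut offset (bases from the start of the site)
SITES = {
    "EcoRI":   ("GAATTC", 1),  # G^AATTC
    "BglII":   ("AGATCT", 1),  # A^GATCT
    "BamHI":   ("GGATCC", 1),  # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "KpnI":    ("GGTACC", 5),  # GGTAC^C
    "SphI":    ("GCATGC", 5),  # GCATG^C
}


def cut_positions(seq, enzyme):
    """Return top-strand cut positions (0-based index of the base after the cut)."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzyme):
    """Return fragment lengths, 5' to 3', for a complete digest of linear DNA."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    dna = "AAAAGAATTCTTTTTTGAATTCCC"  # hypothetical 24-base sequence, two EcoRI sites
    print(cut_positions(dna, "EcoRI"), fragment_lengths(dna, "EcoRI"))
```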

Size separation of the cleaved fragments is performed 
using 8 percent polyacrylamide gel described by Goeddel, D. 
et al., Nucleic Acids Res., 8:4057 (1980). 

"Oligonucleotides" refers to either a single stranded 
polydeoxynucleotide or two complementary polydeoxynucleotide 
strands which may be chemically synthesized. Such synthetic 
oligonucleotides have no 5' phosphate and thus will not
ligate to another oligonucleotide without adding a phosphate
with an ATP in the presence of a kinase. A synthetic
oligonucleotide will ligate to a fragment that has not been
dephosphorylated.

"Ligation" refers to the process of forming 
phosphodiester bonds between two double stranded nucleic acid 
fragments (Maniatis, T., et al., Id., p. 146). Unless 
otherwise provided, ligation may be accomplished using known 
buffers and conditions with 10 units of T4 DNA ligase 
("ligase") per 0.5 µg of approximately equimolar amounts of
the DNA fragments to be ligated. 

Unless otherwise stated, transformation was performed as 
described in the method of Graham, F. and Van der Eb, A., 
Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ
ID NOS:1 through 8, were initially amplified from a
pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were
then inserted into the respective pQE vector listed beneath
the primer sequences, and the enzyme was expressed according
to the protocols set forth herein. The 5' and 3' primer
sequences for the respective genes are as follows:



Thermococcus AEDII12RA-18B/G
5' CCGAC\GAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC
(SEQ ID NO:29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA (SEQ ID NO:30)
Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

(SEQ ID NO:31)
3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT (SEQ ID NO:32)
Vector: ...; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.



5' (SEQ ID NO:33)
3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC (SEQ ID NO:34)
Vector: pQE30; and contains the following restriction enzyme
sites 5' EcoRI and 3' KpnI.



(SEQ ID NO:35)
3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC (SEQ ID NO:36)
Vector: ...; and contains the following restriction enzyme
sites 5' EcoRI and 3' Bgl II.

Thermococcus chitonophagus GC74-22G
5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTC...
(SEQ ID NO:37)
3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC (SEQ ID NO:38)
Vector: pQE12; and contains the following restriction enzyme
sites 5' EcoRI and 3' BamHI.

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG (SEQ ID NO:39) 
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT (SEQ ID NO:40) 
Vector: pQE70; and contains the following restriction enzyme 
sites 5' SphI and 3' Hind III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 
(SEQ ID NO:41) 

3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' KpnI. 

Pyrococcus furiosus VC1-7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 
(SEQ ID NO:43) 

3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme 
sites 5' EcoRI and 3' KpnI. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 
(SEQ ID NO: 45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC (SEQ ID NO: 46) 

Vector: pQE52; and contains the following restriction enzyme 
sites 5' Bam HI and 3' Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCIOT 
(SEQ ID NO:47) 

3' TCTATAAAGCTTTCATTCTCTCTCACCCICTT (SEQ ID NO:48) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

Thermotoga maritima β-mannanase (6GP2) 

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACRTACCTCC (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 

AEPII 1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' Hind III and 3' EcoRI. 
OC1/4V endoglucanase (33GP1) 

(SEQ ID NO:53) 

3' TTTTCGGATCCAATTCTTCATTEACTCTTTGCCTG (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' BamHI and 3' EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 
(SEQ ID NO:55) 

3' ATAAGAAGCTTTTCACTCTCTGTAC/iGAJlCGTRCGC (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme 
sites 5' EcoRI and 3' Hind III. 

The restriction enzyme sites indicated correspond to the 
restriction enzyme sites on the bacterial expression vector 
indicated for the respective gene (Qiagen, Inc.). 
The pQE vector encodes antibiotic resistance (Amp^r), a 
bacterial origin of replication (ori), an IPTG-regulatable 
promoter operator (P/O), a ribosome binding site (RBS), a 
6-His tag and restriction enzyme sites. 
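
The restriction sites listed for each primer can be checked against the primer sequences themselves. The sketch below is illustrative only: it uses the standard recognition sequences of the enzymes named above and four primers copied from the list in this Example; the helper names `sites_in` and `reverse_complement` are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the vector lines.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def sites_in(primer):
    """List the names of the known sites that occur in the primer."""
    primer = primer.upper()
    return sorted(name for name, site in SITES.items() if site in primer)


# Primers copied from the list above.
PRIMERS = {
    "SEQ ID NO:30": "CGGAAGATCTTCATAGCTCCGGAAGCCCATA",
    "SEQ ID NO:39": "AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG",
    "SEQ ID NO:40": "AATAAAAGCTTACTGGATCAGTGTAAGATGCT",
    "SEQ ID NO:42": "CGGAGGTACCTCATGGTTTGAATCTCTTCTC",
}

for name, seq in PRIMERS.items():
    print(name, sites_in(seq), reverse_complement(seq))
```

Because the 3' primers are written as the reverse strand, taking the reverse complement gives the coding-strand view of the gene end and its stop codon.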

The pQE vector was digested with the restriction enzymes 
indicated. The amplified sequences were ligated into the 
respective pQE vector and inserted in frame with the sequence 
encoding for the RBS. The ligation mixture was then used to 
transform the E. coli strain M15/pREP4 (Qiagen, Inc.) by 
electroporation. M15/pREP4 contains multiple copies of the 
plasmid pREP4, which expresses the lacI repressor and also 
confers kanamycin resistance (Kan^r). Transformants were 
identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies were selected. 
Plasmid DNA was isolated and confirmed by restriction 
analysis. Clones containing the desired constructs were 
grown overnight (O/N) in liquid culture in LB media 
supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture was used to inoculate a large culture at a 
ratio of 1:100 to 1:250. The cells were grown to an optical 
density 600 (O.D.600) of between 0.4 and 0.6. IPTG 
("Isopropyl-β-D-thiogalactopyranoside") was then added to a 
final concentration of 1 mM. IPTG induces by inactivating 
the lacI repressor, clearing the P/O leading to increased 
gene expression. Cells were grown an extra 3 to 4 hours. 
Cells were then harvested by centrifugation. 
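
The inoculation ratio and inducer concentration above reduce to simple dilution arithmetic. The sketch below assumes a 1000 ml culture and a 1 M (1000 mM) IPTG stock; both values are hypothetical and chosen only to illustrate the calculation, and the small volume added by the stock is ignored.

```python
def stock_volume_ml(final_conc, stock_conc, culture_ml):
    """Volume of stock needed to reach final_conc in culture_ml.

    final_conc and stock_conc must share the same units. The volume of
    the added stock itself is neglected.
    """
    return final_conc * culture_ml / stock_conc


def inoculum_range_ml(culture_ml, low_ratio=100, high_ratio=250):
    """Overnight-culture volume for the 1:250 to 1:100 inoculation range."""
    return culture_ml / high_ratio, culture_ml / low_ratio


culture_ml = 1000.0                                    # hypothetical culture size
iptg_ml = stock_volume_ml(1.0, 1000.0, culture_ml)     # 1 mM final from 1000 mM stock
low_ml, high_ml = inoculum_range_ml(culture_ml)
print(f"inoculum {low_ml:.0f}-{high_ml:.0f} ml, IPTG stock {iptg_ml:.1f} ml")
```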

The primer sequences set out above may also be employed 
to isolate the target gene from the deposited material by 
hybridization techniques described above. 

Example 2 

Isolation of a Selected Clone From the Deposited Genomic 
Clones 

A clone is isolated directly by screening the 
deposited material using the oligonucleotide primers set 
forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized 
using an Applied Biosystems DNA synthesizer. The 
oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard 
protocol (Maniatis et al., Molecular Cloning: A Laboratory 
Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY, 1982). 
The deposited clones in the pBluescript vectors may be 
employed to transform bacterial hosts which are then plated 
on 1.5% agar plates to the density of 20,000-50,000 
pfu/150 mm plate. These plates are screened using Nylon 
membranes according to the standard screening protocol 
(Stratagene, 1993). Specifically, the Nylon membrane with 
denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, 
sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After 
one hour of prehybridization, the membrane is hybridized 
with hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 
µg/ml denatured, sonicated salmon sperm DNA with 1 x 10^6 
cpm/ml 32P-probe overnight at 42°C. The membrane is washed 
at 45-50°C with washing buffer 6 x SSC, 0.1% SDS for 20-30 
minutes, dried and exposed to Kodak X-ray film. 
Positive clones are isolated and purified by secondary and 
tertiary screening. The purified clone is sequenced to 
verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide 
primers corresponding to the gene of interest are used to 
amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture 
with 0.5 µg of the DNA of the gene of interest. The 
reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 
µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer 
and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR 
(denaturation at 94°C for 1 min; annealing at ... 
min; elongation at 72°C for 1 min) are performed with the 
Perkin-Elmer Cetus automated thermal cycler. The amplified 
product is analyzed by agarose gel electrophoresis and the 
DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of 
interest by subcloning and sequencing the DNA product. The 
ends of the newly purified genes are nucleotide sequenced 
to identify full length sequences. Complete sequencing of 
full length genes is then performed by Exonuclease III 
digestion or primer walking. 
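
The 25 µl reaction composition given above can be converted into pipetting volumes once stock concentrations are chosen. In the sketch below only the final amounts come from the text; every stock concentration is a hypothetical assumption, and the lower MgCl2 bound (1.5 mM) is used.

```python
def volume_ul(final_conc, stock_conc, reaction_ul):
    """Solve C1*V1 = C2*V2 for the stock volume (same units for both concentrations)."""
    return final_conc * reaction_ul / stock_conc


REACTION_UL = 25.0

# Final amounts are from the text; stock concentrations are assumed.
mix = {
    "MgCl2": volume_ul(1.5, 25.0, REACTION_UL),        # 1.5 mM final, 25 mM stock
    "dNTP mix": volume_ul(20.0, 2000.0, REACTION_UL),  # 20 uM each, 2000 uM each stock
    "each primer": 25.0 / 10.0,                        # 25 pmol from a 10 pmol/ul stock
    "Taq polymerase": 0.25 / 5.0,                      # 0.25 U from a 5 U/ul stock
}

for component, ul in mix.items():
    print(f"{component}: {ul:.2f} ul")
```

Volumes as small as the one computed for the polymerase are normally handled by preparing a master mix for several reactions rather than pipetting each tube individually.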

Example 3 

Screening for Galactosidase Activity 
α-Galactosidase protein activity may be screened for as 
follows: 

Substrate plates were provided by a standard plating 
procedure. Dilute XL1-Blue MRF' E. coli host (Stratagene 
Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 with NZY 
media. In 15 ml tubes, inoculate 200 µl diluted host cells 
with phage. Mix gently and incubate tubes at 37 °C for 15 
min. Add approximately 3.5 ml LB top agarose (0.7%) 
containing 1 mM IPTG to each tube and pour onto an NZY 
plate surface. Allow to cool and incubate at 37 °C 
overnight. The assay plates contain the substrate 
p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM 
NaCl, 100 mM potassium phosphate and 1% (w/v) agarose. The 
plaques are overlayed with nitrocellulose and incubated at 
4 °C for 30 minutes, whereupon the nitrocellulose is removed 
and overlayed onto the substrate plates. The substrate 
plates are then incubated at 70 °C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 
A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannanase 
activity. 

A culture solution of the Y1090- E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified library from 
Thermotoga maritima lambda gt11 library was diluted in SM 
(phage dilution buffer): 5 x 10^7 pfu/µl diluted 1:1000 
then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37 °C for 15 minutes. 
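
The phage dilution described above can be followed step by step. This is a minimal sketch using only the numbers stated in the paragraph; the function name `serial_dilution` is hypothetical.

```python
def serial_dilution(start_titer, factors):
    """Return the titer after each successive dilution, starting titer first."""
    titers = [start_titer]
    for factor in factors:
        titers.append(titers[-1] / factor)
    return titers


# 5 x 10^7 pfu/ul diluted 1:1000, then 1:100.
titers = serial_dilution(5e7, [1000, 100])
pfu_plated = titers[-1] * 8  # 8 ul of the final dilution per plating

print(titers)      # titer in pfu/ul after each step
print(pfu_plated)  # total pfu added to the 200 ul of host cells
```

With these figures each plating receives about 4,000 plaque forming units.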

Approximately 4 ml of molten LB top agarose (0.7%) at 
approximately 52 °C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37 °C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB 
plates containing the lambda plaques. The overlay contains 
1% agarose, 50 mM potassium-phosphate buffer pH 7, 0.4% 
Azocarob-galactomannan (Megazyme, Australia). The plates 
were incubated at 72 °C. The Azocarob-galactomannan 
treated plates were observed after 4 hours then returned to 
incubation overnight. Putative positives were identified 
by clearing zones on the Azocarob-galactomannan plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 



Example 5 

Screening of Clones for Mannosidase Activity 
A solid phase screening assay was utilized as a 
primary screening method to test clones for β-mannosidase 
activity. 

A culture solution of the Y1090- E. coli host strain 
(Stratagene Cloning Systems, La Jolla, CA) was diluted to 
O.D.600 = 1.0 with NZY media. The amplified library from 
AEPII 1a lambda gt11 library was diluted in SM (phage 
dilution buffer): 5 x 10^7 pfu/µl diluted 1:1000 then 1:100 
to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They 
were then incubated in 15 ml tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten, LB top agarose (0.7%) at 
approximately 52 °C was added to each tube and the mixture 
was poured onto the surface of LB agar plates. The agar 
plates were then incubated at 37 °C for five hours. The 
plates were replicated and induced with 10 mM IPTG-soaked 
Duralon-UV™ nylon membranes (Stratagene Cloning Systems, 
La Jolla, CA) overnight. The nylon membranes and plates 
were marked with a needle to keep their orientation and the 
nylon membranes were then removed and stored at 4 °C. 

A p-nitrophenyl-β-D-manno-pyranoside overlay was 
applied to the LB plates containing the lambda plaques. 
The overlay contains 1% agarose, 50 mM potassium-phosphate 
buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside 
(Megazyme, Australia). The plates were incubated at 72 °C. 
The p-nitrophenyl-β-D-manno-pyranoside treated plates were 
observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing 
zones on the p-nitrophenyl-β-D-manno-pyranoside plates. 
Two positive clones were observed. 

The nylon membranes referred to above, which 
correspond to the positive clones, were retrieved, oriented 
over the plate and the portions matching the locations of 
the clearing zones for positive clones were cut out. Phage 
was eluted from the membrane cut-out portions by soaking 
the individual portions in 500 µl SM (phage dilution 
buffer) and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 
Pullulanase protein activity may be screened for as 
follows: 

Substrate plates were provided by a standard plating 
procedure. Host cells are diluted to O.D.600 = 1.0 with NZY 
or appropriate media. In 15 ml tubes, inoculate 200 µl 
diluted host cells with phage. Mix gently and incubate 
tubes at 37 °C for 15 min. Approximately 3.5 ml LB top 
agarose (0.7%) is added to each tube and the mixture is 
plated, allowed to cool, and incubated at 37 °C for about 28 
hours. Overlays of 4.5 ml of the following substrate are 
poured: 

100 ml total volume 

0.5 g Red Pullulan (Megazyme, Australia) 
1.0 g Agarose 
5 ml Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml 5 M NaCl 
5 ml CaCl2 (100 mM) 
85 ml dH2O 

Plates are cooled at room temperature, and then incubated 
at 75 °C for 2 hours. Positives are observed as showing 
substrate degradation. 
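
The final concentrations in the pullulan overlay follow from the recipe above. The sketch below assumes the recipe is made up to a total volume of 100 ml, which is a reading of a partly illegible line in the source; the function names are hypothetical.

```python
TOTAL_ML = 100.0  # assumed total volume of the overlay mixture


def final_mM(stock_mM, added_ml, total_ml=TOTAL_ML):
    """Final concentration after diluting added_ml of stock into total_ml."""
    return stock_mM * added_ml / total_ml


def percent_w_v(grams, total_ml=TOTAL_ML):
    """Weight/volume percentage: grams per 100 ml."""
    return 100.0 * grams / total_ml


nacl_mM = final_mM(5000.0, 2.0)    # 2 ml of 5 M NaCl
cacl2_mM = final_mM(100.0, 5.0)    # 5 ml of 100 mM CaCl2
pullulan_pct = percent_w_v(0.5)    # 0.5 g Red Pullulan
agarose_pct = percent_w_v(1.0)     # 1.0 g agarose

print(nacl_mM, cacl2_mM, pullulan_pct, agarose_pct)
```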

Example 7 

Screening for Endoglucanase Activity 
Endoglucanase protein activity may be screened for as 
follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% 
CMC/NZY agar plates (~4,800 plaque forming units/plate) in 
E. coli host with LB agarose as top agarose. The plates are 
incubated at 37°C overnight. 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlayed with Duralon membranes 
(Stratagene) at room temperature for one hour and the 
membranes are oriented and lifted off the plates and stored 
at 4°C. 

4. The top agarose layer is removed and plates are 
incubated at 37°C for ~3 hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 
minutes. 

7. The plate is destained with 1M NaCl. 

8. The putative positives identified on plate are 
isolated from the Duralon membrane (positives are 
identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl 
CHCl3. 

9. Insert DNA is subcloned into any appropriate 
cloning vector and subclones are reassayed for CMCase 
activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at 
maximum speed for 3 minutes. 

ii) Decant the supernatant and use it to fill 
"wells" that have been made in an LB/GelRite/0.1% CMC 
plate. 

iii) Incubate at 37 °C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around 
clone. 


Numerous modifications and variations of the present 
invention are possible in light of the above teachings and, 
therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly 
described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme comprising 
amino acid sequences set forth in SEQ ID NOS: 15-28; 

(b) a polynucleotide which is complementary to 
the polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) or (b) . 

2. The polynucleotide of Claim 1 wherein the 
polynucleotide is DNA. 

3. The polynucleotide of Claim 1 wherein the 
polynucleotide is RNA. 

4. The polynucleotide of Claim 2 which encodes an 
enzyme comprising an amino acid sequence which is a member 
selected from the group consisting of: 

(a) according to SEQ ID NO:15; 
(b) according to SEQ ID NO:16; 
(c) according to SEQ ID NO:17; 
(d) according to SEQ ID NO:18; 
(e) according to SEQ ID NO:19; 
(f) according to SEQ ID NO:20; 
(g) according to SEQ ID NO:21; 
(h) according to SEQ ID NO:22; 
(i) according to SEQ ID NO:23; 
(j) according to SEQ ID NO:24; 
(k) according to SEQ ID NO:25; 
(l) according to SEQ ID NO:26; 
(m) according to SEQ ID NO:27; and 
(n) according to SEQ ID NO:28. 

5. An isolated polynucleotide comprising a member 
selected from the group consisting of: 

(a) a polynucleotide having at least a 70% 
identity to a polynucleotide encoding an enzyme encoded by 
the DNA contained in ATCC Deposit No. 97379, wherein said 
enzyme is selected from the group consisting of M11TL, 
OC1/4V, F1-12G, 9N2-31B/G, MSB8-6G, AEDII12RA-18B/G, GC74-22G 
and VC1-7G1; 

(b) a polynucleotide complementary to the 
polynucleotide of (a); and 

(c) a polynucleotide comprising at least 15 
bases of the polynucleotide of (a) and (b) . 

6. A vector comprising the DNA of Claim 2. 

7. A host cell comprising the vector of Claim 6. 

8. A process for producing a polypeptide comprising: 
expressing from the host cell of Claim 7 a polypeptide 
encoded by said DNA. 

9. A process for producing a cell comprising: 
transforming or transfecting the cell with the vector of 
Claim 6 such that the cell expresses the polypeptide 
encoded by the DNA contained in the vector. 

10. An enzyme comprising a member selected from the 
group consisting of: 

(a) an enzyme comprising an amino acid sequence 
which is at least 70% identical to the amino acid sequence 
set forth in SEQ ID NOS: 15-28; and 

(b) an enzyme which comprises at least 30 amino 
acid residues to the enzyme of (a). 

11. A method for generating glucose from soluble 
cellooligosaccharides comprising: 

administering an effective amount of an enzyme 
selected from the group consisting of an enzyme having the 
amino acid sequence set forth in SEQ ID NOS: 15-28, 

M11TL GLYCOSIDASE - 29G 
COMPLETE GENE SEQUENCE - 9/95 

; rn: aaa rrr . i r aaa cac ttc at*, at a •:»-♦ ta< n a r»r n a »-n. rrr caa -it » i .aa . *: ..i 

M.'l l. V : I*» * • l.y.'' A?:p I ' ty r V ..-i *;.. t I*?.. IMi.- t;| ( , f|„- • :).» At.. *n 

i.i im:t att !■<•»- r.t:t: T*r- cac cat n i . aat a* : I wAT rcr; rcc cta TCC crc cat cat in: cac j.-i* 

■I t;l v Mr- »»r/» Civ !U-t Clu A.*:p A>ii» S.i Asp Tip Trp V* 1 Tip ) Ids Asp I'i.i C|., i,» 

i.-i aac aca cca crT r^.A f*TA <rrr act < u :e ■ cat ttt cir cac aac ecu cca c;irr tac tcc aai iho 

M Ar.n Thi A I.i Alv) Cly Vnl S.-r Cly A..p Pin- l»ro Glu Ast» <iJy ho Gly Tyi n Ar;r* i.h 

JBJ TTA AAC CAA AAT GAC CAC CAC LTc: CCT CiAG AAC CTC CCC GTT AAC ACT ATT ACA (HA CCC 21Ct 

<»1 i.#M» Asn Gin Asn Asp His Asp l.tMi A 1 a Clu Lys Leu Cly Val Asn Tin lie Arg Val Cly 80 

24 1 CTT CAC TGC ACT ACC ATT TTT CCA AAC CCA ACT TTC AAT CTT AAA CTC CCT GTA GAG AGA 300 

Bl Val Clu Trp Ser Arg He Phe Fro Lys Pro Thr Phe Asn Val Lys Val Pro Val Clu Arg 100 

30 1 CAT GAG AAC CGC AGC ATT GTT CAC GTA CAT CTC CAT CAT AAA GCG CTT GAA AGA CTT CAT 360 

101 Asp Glu Asn Cly Ser He Val His Val Asp Val Asp Asp Lys Ala Val Clu Arg Leu Asp 120 

361 GAA TTA CCC AAC AAG GAG GCC GTA AAC CAT TAC GTA GAA ATG TAT AAA CAC TGC CTT CAA 4 20 

121 Clu Leu Ala Asn Lys Clu Ala Val Asn His Tyr Val Clu Met Tyr Lys Asp Trp Val Glu 140 

421 AGA GCT AGA AAA CTT ATA CTC AAT TTA TAC CAT TCC CCC CTC CCT CTC TGC CTT CAC AAC 4 80 

141 Arg Cly Arg Lys Leu He Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 160 

481 CCA ATC ATG GTG AGA AGA ATG GCC CCG CAC AGA GCG CCC TCA CCC TCC CTT AAC GAG GAG S40 

161 Pro He Met Val Arg Arg Met Cly Pro Asp Arg Ala Pro Ser Gly Trp Leu Asn Glu Clu 180 

541 TCC GTG GTG CAC TTT CCC AAA TAC CCC GCA TAC ATT CCT TGC AAA ATC GGC GAG CTA CCT 600 

181 Ser Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr He Ala Trp Lys Met Gly Glu Leu Pro 200 

601 CTT ATG TGC AGC ACC ATG AAC CAA CCC AAC CTC CTT TAT GAG CAA CCA TAC ATC TTC CTT 660 

201 Val Met Trp Ser Thr Met Asn Clu Pro Asn Val Val Tyr Glu Gin Cly Tyr Met Phe Val 220 

661 AAA CCG GCT TTC CCA CCC CCC TAC TTC ACT TTC GAA GCT CCT CAT AAG CCC AGG AGA AAT 720 

221 Lys Cly Cly Phe Pro Pro Gly Tyr Leu Ser Leu Glu Ala Ala Asp Lys Ala Arg Arg Asn 240 

721 ATC ATC CAG CCT CAT GCA CGC CCC TAT CAC AAT ATT AAA CGC TTC ACT AAC AAA CCT CTT 780 

241 Met Ile Gln Ala His Ala Arg Ala Tyr Asp Asn Ile Lys Arg Phe Ser Lys Lys Pro Val 260 

781 GCA CTA ATA TAC CCT TTC CAA TGC TTC CAA CTA TTA GAG GCT CCA CCA GAA CTA TTT GAT 840 

261 Gly Leu He Tyr Ala Phe Gin Trp Phe Glu Leu Leu Glu Gly Pro Ala Clu Val Phe Asp 280 

841 AAG TTT AAG AGC TCT AAC TTA TAC TAT TTC ACA GAC ATA GTA TCG AAC CCT ACT TCA ATC 900 

281 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp He val Ser Lys Gly Ser Ser He 300 

901 ATC AAT GTT CAA TAC AGC ACA CAT CTT CCC AAT AGG CTA CAC TCG TTC CGC CTT AAC TAC 960 

301 He Asn Val Clu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Gly Val Asn Tyr 320 

961 TAT AGC CCT TTA CTC TAC AAA ATC CTC CAT GAC AAA CCT ATA ATC CTC CAC GGC TAT CCA 1020 

321 Tyr Ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro He He Leu His Gly Tyr Cly 340 

1021 TTC CTT TCT ACA CCT CCC CCC ATC ACC CCG GCT CAA AAT CCT TCT AGC CAT TTT CCC TCC 1080 

341 Phe Leu Cys Thr Pro Gly Cly He Ser Pro Ala Clu Asn Pro Cys Ser Asp Phe Cly Trp 360 

1081 CAG CTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA GAA CTT TAC AAC CCA TAC CCC CTA 11 40 

3G1 Clu Val Tyr Pro Clu Gly Leu Tyr Leu Leu Leu Lys Clu Leu Tyr Asn Arg Tyr Gly Val 380 

1141 CAC TTC ATC CTC ACC GAC AAC CCT CTT TCA CAC ACC* AAA: CAT CCG TTC ACA CCC CCA TAC 1200 

381 Asp Leu lie* Val Tin Clu Asn Cly VaJ Ser Asp Set Arg Asp Ala Leu Arg Pro Ala Tyi 400 

1201 CTC CTC* TCC CAT CTT TAC* ACC CTA TCC AAA CCC CCT AAC CAC <*CC* ATT CCC CTC AAA GCC I <fb» 

40 J Vol «*t-r Ilis V»» I Tyi Ser Val Tip Lys Ala A l«* A*;n C|»# Cly I U- f> ro Vrtl Lys Cly 4''° 

I.! tti TAf" rrr rA»* to; acc rrc aca cac aat tac cac t*x. ut cac *«:c tti* a* a: cac aaa rrr i $.**• 

4;n Tyi la*t» If t r- Tip y.t-% Thr Asp A*:n Tyt Clu Trp A I.. Clu Cly l*lw- Arq C'ln ».y;i -Mi' 

OC1/4 GLYCOSIDASE - 33G/B 
COMPLETE GENE SEQUENCE - 9/95 

1 ATc; ATA AGA ACC TCC* GAT TTT I t A AAA UAT TTT ATC TTC GCA A< U i :t T At i ; lU A <;CA TAC t>0 

l MrM lie Arg Arg Ser Asp *•» i> l.y?: Ar;p l>he lie Plie Gly Thr Al.i Thr Al* A)4 Tyr 20 

M I AG ATT CAA GGT GCA CCA AAf * GAA l^AT GCC ACA CCG CCA TCA ATT TGG CAT GTC TTT TCA 120 

?\ Gin lie Glu Cly Ala Ala Asn Glu Asp Gly Arg Gly Pro Ser lie Tip Asp Va I phe Ser 40 

) 21 TAC ACC CCT CGC AAA ACC CTG AAC GGT GAC ACA GCA GAC GTT GCC TtrT GAC CAT TAT CAC 180 

4 1 Mis Thr Pro Cly Lys Thr Leu Asn Gly Asp Thr Gly Asp Val Ala Cys Asp His Tyr His 60 

IHl CCA TAC AAG CAA GAT ATC CAC CTC ATC AAA CAA ATA CCG TTA CAC CCT TAC ACC TTC TCT 24 0 

6 J Arg Tyr Lys Clu Asp lie Gin Leu Met Lys Giu He Gly Leu Asp Ala Tyr Arg Phe Ser 80 

241 ATC TCC TCG CCC ACA ATT ATC CCA CAT GGG AAG AAC ATC AAC CAA AAC CCT CTC CAT TTC 300 

81 He Ser Trp Pro Arg He Met Pro Asp Giy Lys Asn He Asn Gin Lys Cly Val Asp Phe 100 

301 TAC AAC AGA CTC GTT CAT GAG CTT TTC AAG AAT GAT ATC ATA CCA TTC GTA ACA CTC TAT 360 

101 Tyr Asn Arg Leu Val Asp Giu Leu Leu Lys Asn Asp He He Pro Phe Val Thr Leu Tyr 120 

361 CAC TCG GAC TTA CCC TAC CCA CTT TAT GAA AAA GGT GCA TCG CTT AAC CCA GAT ATA CCG 420 

121 His Trp Asp Leu Pro Tyr Ala Leu Tyr Clu Lys Cly Cly Trp Leu Asn Pro Asp He Ala 140 

421 CTC TAT TTC ACA GCA TAC CCA ACC TTT ATC TTC AAC GAA CTC CCT CAT CCT CTC AAA CAT 480 

141 Leu Tyr Phe Arg Ala Tyr Ala Thx Phe Met Phe Asn Clu Leu Gly Asp Arg Val Lys His 160 

481 TCG ATT ACA CTC AAC CAA CCA TCG TCT TCT TCT TTC TCG GGT TAT TAC ACG CCA GAG CAT S40 

161 Trp He Thr Leu Asn Giu Pro Trp Cys Ser Ser Phe Ser Gly Tyr Tyr Thr Gly Clu His 180 

S41 CCC CCG GGT CAT CAA AAT TTA CAA GAA CCG ATA ATC CCG GCC CAC AAC CTC TTC AGG GAA 600 

181 Ala Pro Gly His Gin Asn Leu Gin Giu Ala He He Ala Ala His Asn Leu Leu Arg Giu 200 

601 CAT GCA CAT CCC CTC CAC CCG TCC AGA GAA CAA GTA AAA GAT GGG CAA GTT CCC TTA ACC 660 

201 His Cly His Ala Val Gin Ala Ser Arg Giu Giu Val Lys Asp Gly Giu Val Cly Leu Thr 220 

661 AAC GTT CTC ATC AAA ATA CAA CCG GCC GAT CCA AAA CCC GAA ACT TTC TTC CTC GCA ACT 720 

221 Asn Val Val Met Lys He Clu Pro Gly Asp Ala Lys Pro Giu Ser Phe Leu Val Ala Ser 240 

721 CTT CTT GAT AAC TTC GTT AAT GCA TGG TCC CAT GAC CCT GTT GTT TTC GGA AAA TAT CCC 780 
241 Leu Val Asp Lys Phe Val Asn Ala Trp Ser His Asp Pro Val Val Phe Gly Lys Tyr Pro 260 

781 CAA GAA CCA GTT GCA CTT TAT ACG GAA AAA GGG TTC CAA CTT CTC GAT ACC GAT ATC AAT 840 

261 Giu Glu Ala Val Ala Leu Tyr Thr Clu Lys Gly Leu Gin Val Leu Asp Ser Asp Met Asn 280 

841 ATT ATT TCG ACT CCT ATA GAC TTC TTT GGT GTG AAT TAT TAC ACA AGA ACA CTT GTT GTT 900 

281 He He Ser Thr Pro He Asp Phe Phe Gly Val Asn Tyr Tyr Thr Arg Thr Leu Val Val 300 

901 TTT GAT ATC AAC AAT CCT CTT GGA TTT TCG TAT CTT CAC GGA CAC CTT CCC AAA ACG GAC 960 

301 Phe Asp Met Asn Asn Pro Leu Gly Phe Ser Tyr Val Gin Gly Asp Leu Pro Lys Thr Glu 320 

961 ATC GGA TGG CAA ATC TAC CCG CAC GGA TTA TTT CAT ATG CTC GTC TAT CTC AAG GAA AGA 1020 

321 Met Gly Trp Glu He Tyr Pro Gin Gly Leu Phe Asp Het Leu Val Tyr Leu Lys Glu Arg 340 

1021 TAT AAA CTA CCA CTT TAT ATC ACA CAC AAC CCG ATG CCT CCA CCT GAT AAA TTC GAA AAC 1080 

341 Tyr Lys Leu Pro Leu Tyr He Thr Glu Asn Cly Met Ala Cly Pro Asp Lys Leu Clu Asn 360 

10B1 CCA AGA CTT CAT CAT AAT TAC CCA ATT CAA TAT TTC CAA AAG CAC TTT GAA AAA CCA CTT 1140 

361 Gly Arg Val His Asp Asn Tyr Arg He Glu Tyr Leu Glu Lys His Phe Glu Lys Ala Leu 3 80 

1141 GAA CCA ATC AAT CCA CAT CTT CAT TTC AAA CCT TAC TTC ATT TCC TCT TTC ATC GAT AAC 1200 

181 Clu Ala He Asn Ala Asp Val Asp Leu Lys Cly Tyr Phe He Trp Ser Leu Het Asp Asn 400 

1201 TTC CAA TGG CCG TCC GCA TAC TCC AAA CCT TTC CCT ATA ATC TAC CTA GAT TAC AAT ACC 1260 

401 Phe Clu Trp Ala Cys Cly Tyr Ser Lys Arg Phe Gly He He Tyr Val Asp Tyr Asn Thr 420 

1261 CCA AAA AGG ATA TTG AAA CAT TCA GCC ATC TCG TTC AAC CAA TTT CTA AAA TCT TAA 1 3 17 

421 Pro Lys Arg Ile Leu Lys Asp Ser Ala Met Trp Leu Lys Glu Phe Leu Lys Ser End 439 
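
Each listing pairs a line of codons with the residues they encode, so a printed codon line can be checked by translating it with the standard genetic code. The sketch below is illustrative only; the table is built from the conventional T, C, A, G ordering and the function name `translate` is hypothetical. A disagreement between the translation and the printed residues may indicate a transcription error in the printed codons rather than in the underlying sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate DNA to one-letter amino acids ('*' = stop, 'X' = unreadable codon)."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Final codon line of the listing above, copied as printed.
last_line = ("CCA AAA AGG ATA TTG AAA CAT TCA GCC ATC TCG TTC AAC "
             "CAA TTT CTA AAA TCT TAA")
print(translate(last_line))
```

Codons containing characters other than A, C, G and T are reported as `X`, which makes unreadable positions in a printed listing easy to locate.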
STAPHYLOTHERMUS MARINUS GLYCOSIDASE - 12G 
COMPLETE GENE SEQUENCE - 9/95 



1 TTC ATA AGO TTT CCT CAT TAT TTC TTi; TTT CCA A« A t;r*T Af*A TCA TCC t*Ai ( AC ATI c:Af. 60 
1 Met Ile Arg Phe Pro Asp Tyr Phe Leu Phe Gly Thr Ala Thr Ser Ser His Gln Ile Glu 20 

61 CCT AAT AAC ATA TTT AAT CAT TCC TCC CAC TCC CAG ACT AAA CCC AGG ATT AAC. rm; AGA 120 
21 Gly Asn Asn Ile Phe Asn Asp Trp Trp Glu Trp Glu Thr Lys Gly Arg Ile Lys Val Arg 40 

121 TCC CCT AAC CCA TCT AAT CAT TCC CAA CTC TAT AAA CAA CAC ATA CAC CTT ATC CCT CAG 180 
41 Ser Gly Lys Ala Cys Asn His Trp Glu Leu Tyr Lys Glu Asp Ile Glu Leu Met Ala Glu 60 

181 CTC CCA TAT AAT OCT TAT AGG TTC TCC ATA CAG TCC ACT AGA ATA TTT CCC AGA AAA GAT 240 
61 Leu Gly Tyr Asn Ala Tyr Arg Phe Ser Ile Glu Trp Ser Arg Ile Phe Pro Arg Lys Asp 80 

241 CAT ATA CAT TAT GAG TCC CTT AAT AAG TAT AAC CAA ATA CTT AAT CTA CTT ACA AAA TAC 300 
81 His Ile Asp Tyr Glu Ser Leu Asn Lys Tyr Lys Glu Ile Val Asn Leu Leu Arg Lys Tyr 100 

301 GCG ATA GAA CCT CTA ATC ACT CTT CAC CAC TTC ACA AAC CCC CAA TGG TTT ATC AAA ATT 360 
101 Gly Ile Glu Pro Val Ile Thr Leu His His Phe Thr Asn Pro Gln Trp Phe Met Lys Ile 120 

361 CCT GGA TGG ACT AGO GAA CAC AAC ATA AAA TAT TTT ATA AAA TAT GTA GAA CTT ATA CCT 420 
121 Gly Gly Trp Thr Arg Glu Glu Asn Ile Lys Tyr Phe Ile Lys Tyr Val Glu Leu Ile Ala 140 

421 TCC GAG ATA AAA CAC CTC AAA ATA TGG ATC ACT ATT AAT GAA CCA ATA ATA TAT CTT TTA 480 
141 Ser Glu Ile Lys Asp Val Lys Ile Trp Ile Thr Ile Asn Glu Pro Ile Ile Tyr Val Leu 160 

481 CAA GGA TAT ATT TCC CCC GAA TCC CCA CCT GGA ATT AAA AAT TTA AAA ATA CCT CAT CAA 540 
161 Gln Gly Tyr Ile Ser Gly Glu Trp Pro Pro Gly Ile Lys Asn Leu Lys Ile Ala Asp Gln 180 

541 GTA ACT AAC AAT CTT TTA AAA OCA CAT AAT GAA CCC TAT AAT ATA CTT CAT AAA CAC CCT 600 
181 Val Thr Lys Asn Leu Leu Lys Ala His Asn Glu Ala Tyr Asn Ile Leu His Lys His Gly 200 

601 ATT GTA CCC ATA CCT AAA AAC ATC ATA CCA TTT AAA CCA CCA TCT AAT AGA CCA AAA CAC 660 
201 Ile Val Gly Ile Ala Lys Asn Met Ile Ala Phe Lys Pro Gly Ser Asn Arg Gly Lys Asp 220 

661 ATT AAT ATT TAT CAT AAA CTC CAT AAA GCA TTC AAC TGG GGA TTT CTC AAC GGA ATA TTA 720 
221 Ile Asn Ile Tyr His Lys Val Asp Lys Ala Phe Asn Trp Gly Phe Leu Asn Gly Ile Leu 240 

721 AGG GGA GAA CTA GAA ACT CTC CCT GCA AAA TAC CCA CTT GAG CCC CCA AAT ATT CAT TTC 780 
241 Arg Gly Glu Leu Glu Thr Leu Arg Gly Lys Tyr Arg Val Glu Pro Gly Asn Ile Asp Phe 260 

781 ATA GGC ATA AAC TAT TAT TCA TCA TAT ATT GTA AAA TAT ACT TCC AAT CCT TTT AAA CTA 840 
261 Ile Gly Ile Asn Tyr Tyr Ser Ser Tyr Ile Val Lys Tyr Thr Trp Asn Pro Phe Lys Leu 280 

841 CAT ATT AAA CTC GAA CCA TTA CAT ACA CCT CTA TCC ACA ACT ATG CCT TAC TCC ATA TAT 900 
281 His Ile Lys Val Glu Pro Leu Asp Thr Gly Leu Trp Thr Thr Met Gly Tyr Cys Ile Tyr 300 

901 CCT AGA GGA ATA TAT GAA GTT CTA ATG AAA ACT CAT GAG AAA TAC CCC AAA GAA ATA ATC 960 
301 Pro Arg Gly Ile Tyr Glu Val Val Met Lys Thr His Glu Lys Tyr Gly Lys Glu Ile Ile 320 

961 ATT ACA GAG AAC CCT CTT CCA CTA CAA AAT GAT CAA TTA AGG ATT TTA TCC ATT ATC AGG 1020 
321 Ile Thr Glu Asn Gly Val Ala Val Glu Asn Asp Glu Leu Arg Ile Leu Ser Ile Ile Arg 340 

1021 CAC TTA CAA TAC TTA TAT AAA CCC ATC AAT CAA CCA CCA AAC CTC AAA CCA TAT TTC TAC 1080 
341 His Leu Gln Tyr Leu Tyr Lys Ala Met Asn Glu Gly Ala Lys Val Lys Gly Tyr Phe Tyr 360 

1081 TCC AGC TTC ATC GAT AAT TTT CAC TCC CAT AAA CCA TTT AAC CAA ACC TTC CCA CTA CTA 1140 
361 Trp Ser Phe Met Asp Asn Phe Glu Trp Asp Lys Gly Phe Asn Gln Arg Phe Gly Leu Val 380 

1141 CAA CTT CAT TAT AAC ACT TTT CAC ACA AAA CCT AGA AAA ACC CCA TAT CTA TAT ACT CAA 1200 
381 Glu Val Asp Tyr Lys Thr Phe Glu Arg Lys Pro Arg Lys Ser Ala Tyr Val Tyr Ser Gln 400 

1201 ATA GCA CCT ACC AAC ACT ATA ACT GAT GAA TAC CTA GAA AAA TAT CCA TTA AAC AAC CTC 1260 
401 Ile Ala Arg Thr Lys Thr Ile Ser Asp Glu Tyr Leu Glu Lys Tyr Gly Leu Lys Asn Leu 420 

1261 GAA TAA 1266 
421 Glu End 422 
THERMOCOCCUS 9N2 GLYCOSIDASE - 31B/G 
COMPLETE GENE SEQUENCE - 9/95 

ATC CT* CCA GAA GGC TTT CTC TCC CCC HTC TCC CAn TCC GGC TTT CAG TTC GAG ATC CCC 60 

1 net l-mj Pro clu cly F**a Lau Trp Cly Val Sar Cln sax Gly Pha Gin Pns Glu NtC <Uy 20 

61 GAC AAG CTC AGO ACQ AAC ATT CAT CCC AAC ACA CAC TOG TCC AAG TCC CTC AGG CAT CCC 120 

21 Asp Lyu Lsu Arg Arg Asn Ila Asp rro Asn Tbr Asp Trp Trp Ly« Trp val Arg Asp pro 40 

121 TTC AAC ATA AAG AGO GAA CTC CTC ACC CJC GAC CTC CCC GAG GAG CGG ATA AAC AAC TAC 190 

41 Ph« Asn lie !.yi Arp Clu _«o val 3-r Cly Asp *«eu Pro Glu Clu Gly lie Kwn An Tyr 60 

181 GAA CTT TAC GAG AAC CAT CAC CCC CTC CCC AliA GAC CTC CCT CTC AAC CTT TAC ACQ ATT 240 

61 Glu Leu Tyr Gla Lys Asp nxm Arc Leu Ala Arg Asp Leu oly Leo Am Val Tyr Arg He 80 

241 GGA ATA GAG TOG ACC AGG ATC TTT CCC TGG CCA ACC TOG TTT GTG GAG CTT GAC CTT GAG 300 

81 Gly He Clu Trp ser Arg lie Pbe Fro Trp rro Tiir Trp Pbe val Glu Val Asp Val Glu 100 

301 CGG GAC AGC TAC CCA CTC CTC AAC CAC GfC AAA ATC CAT AAA GAC ACC CTC GAA GAG CTC 360 

101 Arg Asp S«r Tyr Cly I*«u Vnl Lys Asp Val Lys II* Asp Lys Asp Thx Lsu Clu Clu Leu 120 

361 CAC CAC ATA CCC AAT CAT CAC CAC ATA CCC TAC TAC CGC CCC CTT ATA GAC CAC CTC ACC 420 

121 Asp Clu llm Ala Asn His Cln Clu llm Aid Tyr Tyr Arg Arg Val XI* Glu His Leu Arg 140 

421 CAG CTC CCC TTC AAC CTC ATC CTC AAC CTC AAC CAC TTC ACC CTC CCC CTC TOC CTT CAC 480 

141 Clu L*u Cly Pbe Lx» Val lis Val Asn !*e«i Asn His Pb« Thr Lau Pro Uu Trp L«u His 160 

481 CAT CCC ATA ATC CCC ACC CAC AAC CCC CTC ACC AAC CCT ACC ATT CCC TCC CTC CCC CAC 540 

161 Asp Pro Ile Ile Ala Arg Glu Lys Ala Leu Thr Asn Gly Arg Ile Gly Trp Val Gly Gln 180

541 CAC ACC CTC CTC CAC TTC CCC AAG TAC CCC GCC TAC ATC CCC AAC CCA CTC COS GAC CTC 600 

181 Glu Ser Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala Asn Ala Leu Gly Asp Leu 200

601 CTT CAT ATG TGG ACC ACC TTC AAC CAG CCC ATC CTC CTT CTC CAC CTC OCT TAC CTC CCC 660

201 Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu Gly Tyr Leu Ala 220

661 CCC TAC TCC GCC TTT CCC CCG CGG CTT ATG AAC CCC GAC GCG CCA AAG CTG GGA ATC CTC 720 

221 Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala Ala Lys Leu Ala Ile Leu 240

721 AAC ATC ATA AAC CCC CAC CCA CTC CCC TAC AAG ATO ATA AAG AAG TTC GAC AGO OTA AAG 780 

241 Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met Ile Lys Lys Phe Asp Arg Val Lys 260

781 GCC CAT AAG GAT TCC CGC TCC GAG GCC GAC GTC GOG ATA ATC TAC AAC AAC ATA OGC CTT 840 

261 Ala Asp Lys Asp Ser Arg Ser Glu Ala Glu Val Gly Ile Ile Tyr Asn Asn Ile Gly Val 280

841 GCC TAT CCA TAC SAC TCC AAC GAC CCA AAG GAC GTG AAA CCT GCA GAA AAC GAC AAC TAC 900
THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE - 22G
COMPLETE SEQUENCE - 9/95

I TTCCTTCCA CAC AAC TTT CTC TCC CCA CTT TCA CAC TCC CCA TTC CAC TTT CAA ATC CCC 

1 Met Leu Pro Glu Asn Phe Leu Trp Gly Val Ser Gln Ser Gly Phe Gln Phe Glu Met Gly

61 CAC AGA CTC AGG ACC CAC ATT CAT CCA AAC ACA GAT TCC TGG TAC TCG CTA AGA CAT CAA 

21 Asp Arg Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Tyr Trp Val Arg Asp Glu

121 TAT AAT ATC AAA AAA CCA CTA CTA ACT GOG GAT CTT CCC CAA CAC CCT ATA AAT TCA TAT 

41 Tyr Asn Ile Lys Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Asp Gly Ile Asn Ser Tyr

181 CAA TTA TAT GAG ACA CAC CAA CAA ATT CCA AAG CAT TTA CCC CTC AAC ACA TAT ACC ATC

61 Glu Leu Tyr Glu Arg Asp Gln Glu Ile Ala Lys Asp Leu Gly Leu Asn Thr Tyr Arg Ile

241 CCA ATT CAA TCC ACC ACA CTA TTT CCA TGG CCA ACC ACT TTT CTC CAC CTC CAC TAT CAA 

81 Gly Ile Glu Trp Ser Arg Val Phe Pro Trp Pro Thr Thr Phe Val Asp Val Glu Tyr Glu

301 ATT CAT CAC TCT TAC GGG TTC CTA AAG CAT CTG AAG ATT TCT AAA GAC CCA TTA CAA AAA 

101 Ile Asp Glu Ser Tyr Gly Leu Val Lys Asp Val Lys Ile Ser Lys Asp Ala Leu Glu Lys

361 CTT GAT CAA ATC CCT AAC CAA AGG CAA ATA ATA TAT TAT AGO AAC CTA ATA AAT TCC CTA 

121 Leu Asp Glu Ile Ala Asn Gln Arg Glu Ile Ile Tyr Tyr Arg Asn Leu Ile Asn Ser Leu

421 ACA AAG AGG CCT TTT AAC CTA ATA CTA AAC CTA AAT CAT TTT ACC CTC CCA ATA TCC CTT 

141 Arg Lys Arg Gly Phe Lys Val Ile Leu Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu

481 CAT GAT CCT ATC GAA TCT AGA CAA AAA CCC CTG ACC AAT AAG AGA AAC GGA TCG GTA ACC 

161 His Asp Pro Ile Glu Ser Arg Glu Lys Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Ser

541 GAA AGG ACT GTT ATA GAC TTT CCA AAA TTT GCC CCC TAT TTA CCA TAT AAA TTC GGA GAC 

181 Glu Arg Ser Val Ile Glu Phe Ala Lys Phe Ala Ala Tyr Leu Ala Tyr Lys Phe Gly Asp

601 ATA GTA CAC ATG TGG AGC ACA TTT AAT GAA CCT ATC CTC CTC GCC GAG TTC GGG TAT TTA 

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Ala Glu Leu Gly Tyr Leu

661 CCC CCA TAC TCA CCA TTC CCC CCC CCA CTC ATG AAT CCA CAA CCA CCA AAC TTA CTT ATC 

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala Ala Lys Leu Val Met

721 CTA CAT ATC ATA AAC GCC CAT CCT TTA CCA TAT ACC ATO ATA AAC AAA TTT GAC ACA AAA 

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Met Ile Lys Lys Phe Asp Arg Lys

781 AAA CCT GAT CCA GAA TCA AAA GAA CCA CCT GAA ATA CCA ATT ATA TAC AAT AAC ATC CGC 

261 Lys Ala Asp Pro Glu Ser Lys Glu Pro Ala Glu Ile Gly Ile Ile Tyr Asn Asn Ile Gly

841 GTC ACA TAT CCG TTT AAT CCC AAA GAC TCA AAG GAT CTA CAA GCA TCC CAT AAT CCC AAT 

281 Val Thr Tyr Pro Phe Asn Pro Lys Asp Ser Lys Asp Leu Gln Ala Ser Asp Asn Ala Asn






901 TTC TTC CAC ACT GGG CTA TTC TTA ACC CCT ATC CAC AGG CCA AAA TTA AAT ATC GAA 

301 Phe Phe His Ser Gly Leu Phe Leu Thr Ala Ile His Arg Gly Lys Leu Asn Ile Glu Phe

961 GAC GGA GAG ACA TTT CTT TAC CTT CCA TAT TTA AAG CGC AAT GAT TGG CTG GGA CTG AAT 

321 Asp Gly Glu Thr Phe Val Tyr Leu Pro Tyr Leu Lys Gly Asn Asp Trp Leu Gly Val Asn

1021 TAT TAT ACA AGA CAA GTC CTT AAA TAC CAA CAT CCC ATC TTT CCA ACT ATC CCT CTC ATA 

341 Tyr Tyr Thr Arg Glu Val Val Lys Tyr Gln Asp Pro Met Phe Pro Ser Ile Pro Leu Ile

1081 AGC TTC AAC GCC CTT CCA CAT TAT GGA TAC CCA TCT ACA CCA CCA ACC ACC TCA AAC GAC

361 Ser Phe Lys Gly Val Pro Asp Tyr Gly Tyr Gly Cys Arg Pro Gly Thr Thr Ser Lys Asp

1141 GGT AAT CCT GTT ACT GAC ATT GGA TCG GAC CTA TAT CCC AAA CGC ATC TAC CAC TCT ATA 

381 Gly Asn Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Lys Gly Met Tyr Asp Ser Ile

1201 GTA CCT CCC AAT GAA TAT GGA CTT CCT CTA TAC GTA ACA GAA AAC CCA ATA GCA CAT TCA 

401 Val Ala Ala Asn Glu Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser

1261 AAA CAT CTA TTA AGG CCC TAT TAC ATC CCA TCT CAC ATT GAA GCC ATC GAA GAC CCT TAC 

421 Lys Asp Val Leu Arg Pro Tyr Tyr Ile Ala Ser His Ile Glu Ala Met Glu Glu Ala Tyr

1321 CAA AAT CCT TAT GAC CTG AGA OCA TAC TTA CAC TCG GCA TTA ACC CAT AAT TAC CAA TCG 1380

441 Glu Asn Gly Tyr Asp Val Arg Gly Tyr Leu His Trp Ala Leu Thr Asp Asn Tyr Glu Trp 460

1381 CCC TTA GCC TTC AGA ATG AGG TTT GGC TTC TAC CAA CTA AAC TTC ATA ACC AAA GAC AGA 1440

461 Ala Leu Gly Phe Arg Met Arg Phe Gly Leu Tyr Glu Val Asn Leu Ile Thr Lys Glu Arg 480

1441 AAA CCC AGG AAA AAG ACT GTA AGA CTA TTC AGA GAC ATA CTT ATT AAT AAT GCG CTA ACA 1500

481 Lys Pro Arg Lys Lys Ser Val Arg Val Phe Arg Glu Ile Val Ile Asn Asn Gly Leu Thr 500

1501 AGC AAC ATC AGG AAA GAG ATC TTA CAC CAC GCC TAG 1536

501 Ser Asn Ile Arg Lys Glu Ile Leu Glu Glu Gly End 512

Pyrococcus furiosus glycosidase - 7G1

COMPLETE GENE SEQUENCE - 10/95

1 ATC TTC CCT GAA AAG TTC CTT TGG GGT CTG GCA CAA TCG GGT TTT CAG TTT GAA ATC GGG 60

1 Met Phe Pro Glu Lys Phe Leu Trp Gly Val Ala Gln Ser Gly Phe Gln Phe Glu Met Gly 20

61 GAT AAA CTC AGG AGG AAT ATT GAC ACT AAC ACT GAT TGG TGG CAC TGG CTA ACG GAT AAG 120 

21 Asp Lys Leu Arg Arg Asn Ile Asp Thr Asn Thr Asp Trp Trp His Trp Val Arg Asp Lys 40

121 ACA AAT ATA GAG AAA GGC CTC GTT AGT GGA GAT CTT CCC GAG GAG GGG ATT AAC AAT TAC 180

41 Thr Asn Ile Glu Lys Gly Leu Val Ser Gly Asp Leu Pro Glu Glu Gly Ile Asn Asn Tyr 60

181 GAG CTT TAT GAG AAG GAC CAT GAG ATT GCA AGA AAG CTC GGT CTT AAT GCT TAC AGA ATA 240 

61 Glu Leu Tyr Glu Lys Asp His Glu Ile Ala Arg Lys Leu Gly Leu Asn Ala Tyr Arg Ile 80

241 GGC ATA GAG TGG AGC AGA ATA TTC CCA TGG CCA ACG ACA TTT ATT GAT CTT GAT TAT AGC 300 

81 Gly Ile Glu Trp Ser Arg Ile Phe Pro Trp Pro Thr Thr Phe Ile Asp Val Asp Tyr Ser 100

301 TAT AAT GAA TCA TAT AAC CTT ATA GAA GAT GTA AAG ATC ACC AAG GAC ACT TTG GAG GAG 360

101 Tyr Asn Glu Ser Tyr Asn Leu Ile Glu Asp Val Lys Ile Thr Lys Asp Thr Leu Glu Glu 120

361 TTA GAT GAG ATC CCC AAC AAG ACG GAG CTC CCC TAC TAT AGG TCA GTC ATA AAC AGC CTG 420

121 Leu Asp Glu Ile Ala Asn Lys Arg Glu Val Ala Tyr Tyr Arg Ser Val Ile Asn Ser Leu 140

421 AGG ACC AAG GGC TTT AAG GTT ATA GTT AAT CTA AAT CAC TTC ACC CTT CCA TAT TGG TTG 480

141 Arg Ser Lys Gly Phe Lys Val Ile Val Asn Leu Asn His Phe Thr Leu Pro Tyr Trp Leu 160

481 CAT GAT CCC ATT GAC GCT AGG GAG AGC CCC TTA ACT AAT AAG ACG AAC GCC TGG CTT AAC 540 

161 His Asp Pro Ile Glu Ala Arg Glu Arg Ala Leu Thr Asn Lys Arg Asn Gly Trp Val Asn 180

541 CCA AGA ACA CTT ATA GAG TTT GCA AAG TAT GCC GCT TAC ATA GCC TAT AAG TTT CGA GAT 600 

181 Pro Arg Thr Val Ile Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala Tyr Lys Phe Gly Asp 200

601 ATA GTG GAT ATG TCG AGC ACG TTT AAT GAG CCT ATG GTG CTT GTT GAG CTT GGC TAC CTA 660 

201 Ile Val Asp Met Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu Gly Tyr Leu 220

661 GCC CCC TAC TCT GGC TTC CCT CCA GGG GTT CTA AAT CCA GAG GCC GCA AAG CTG GCG ATA 720 

221 Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Leu Asn Pro Glu Ala Ala Lys Leu Ala Ile 240

721 CTT CAC ATG ATA AAT GCA CAT CCT TTA GCT TAX AGC CAG ATA AAG AAG TTT GAC ACT GAG 780 

241 Leu His Met Ile Asn Ala His Ala Leu Ala Tyr Arg Gln Ile Lys Lys Phe Asp Thr Glu 260

781 AAA GCT GAT AAG GAT TCT AAA GAG CCT GCA GAA GTT GGT ATA ATT TAC AAC AAC ATT GGA 840 

261 Lys Ala Asp Lys Asp Ser Lys Glu Pro Ala Glu Val Gly Ile Ile Tyr Asn Asn Ile Gly 280

841 GTT GCT TAT CCC AAG GAT CCG AAC GAT TCC AAG GAT CTT AAG GCA GCA GAA AAC GAC AAC 900 

281 Val Ala Tyr Pro Lys Asp Pro Asn Asp Ser Lys Asp Val Lys Ala Ala Glu Asn Asp Asn 300

901 TTC TTC CAC TCA GCG CTG TTC TTC GAG GCC ATA CAC AAA GGA AAA CTT AAT ATA GAG TTT 960 

301 Phe Phe His Ser Gly Leu Phe Phe Glu Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe 320

961 GAC GCT GAA ACG TTT ATA GAT CCC CCC TAT CTA AAG GGC AAT GAC TGG ATA GGG CTT AAT 1020 

321 Asp Gly Glu Thr Phe Ile Asp Ala Pro Tyr Leu Lys Gly Asn Asp Trp Ile Gly Val Asn 340

1021 TAC TAC ACA AGG GAA GTA CTT ACG TAT CAG GAA CCA ATG TTT CCT TCA ATC CCG CTG ATC 1080 

341 Tyr Tyr Thr Arg Glu Val Val Thr Tyr Gln Glu Pro Met Phe Pro Ser Ile Pro Leu Ile 360

1081 ACC TTT AAG CGA GTT CAA GGA TAT CCC TAT GCC TCC AGA CCT GGA ACT CTG TCA AAC GAT 1140 

361 Thr Phe Lys Gly Val Gln Gly Tyr Gly Tyr Ala Cys Arg Pro Gly Thr Leu Ser Lys Asp 380

1141 GAC AGA CCC GTC AGC GAC ATA CGA TCG GAA CTC TAT CCA GAG GGG ATC TAC GAT TCA ATA 1200 

381 Asp Arg Pro Val Ser Asp Ile Gly Trp Glu Leu Tyr Pro Glu Gly Met Tyr Asp Ser Ile 400

1201 GTT GAA GCT CAC AAG TAC GGC GTT CCA CTT TAC GTG ACG GAG AAC GGA ATA GCG GAT TCA 1260 

401 Val Glu Ala His Lys Tyr Gly Val Pro Val Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser 420
Bankia gouldi endoglucanase (37GP1)

9 18 27 36 45 54

5* ATG ACA ATA CGT TTA GCG ACQ CTC GCQ CTC TGC CCA GCG CTG AGC CCA CTC ACC 
Met Arg Ile Arg Leu Ala Thr Leu Ala Leu Cys Ala Ala Leu Ser Pro Val Thr

63 72 81 90 99 108

TTT CCA CAT AAT GTA ACC CTA CAA ATC GAC GCC GAC GCC GGT AAA AAA CTC ATC 
Phe Ala Asp Asn Val Thr Val Gln Ile Asp Ala Asp Gly Gly Lys Lys Leu Ile

117 126 135 144 153 162 

ACC CCA GCC CTT TAC COC ATG AAT AAC TCC AAC CCA CAA ACC CTT ACC GAT ACT 
Ser Arg Ala Leu Tyr Gly Met Asn Asn Ser Asn Ala Glu Ser Leu Thr Asp Thr

171 180 189 198 207 216

GAC TOO CAO CGT TTT CGC GAT GCA GGT GTG CGC ATG CTG CGG GAA AAT GGC GCC 
Asp Trp Gln Arg Phe Arg Asp Ala Gly Val Arg Met Leu Arg Glu Asn Gly Gly

225 234 243 252 261 270 

AAC AAC AGC ACC AAA TAT AAC TOG CAA CTG CAC CTG AGC AGT CAT COG GAT TOG 
Asn Asn Ser Thr Lys Tyr Asn Trp Gln Leu His Leu Ser Ser His Pro Asp Trp

279 288 297 306 315 324

TAC AAC AAT CTC TAC GCC GGC AAC AAC AAC TGG GAC AAC CGG GTA GCC CTG ATT 
Tyr Asn Asn Val Tyr Ala Gly Asn Asn Asn Trp Asp Asn Arg Val Ala Leu Ile

333 342 351 360 369 378 

CAG GAA AAC CTG CCC GGC GCC GAC ACC ATC TGG GCA TTC CAG CTC ATC OCT AAG 
Gln Glu Asn Leu Pro Gly Ala Asp Thr Met Trp Ala Phe Gln Leu Ile Gly Lys

387 396 405 414 423 432

OTC GCQ GCG ACT TCT GCC TAC AAC TTT AAC GAT TGG GAA TTC AAC CAO TCG CAA 
Val Ala Ala Thr Ser Ala Tyr Asn Phe Asn Asp Trp Glu Phe Asn Gln Ser Gln

441 450 459 468 477 486 

TGG TOG ACC GGC GTC GCT CAG AAT CTC GCT GGC GGC GOT GAA CCC AAT CTG GAC 

Trp Trp Thr Gly Val Ala Gln Asn Leu Ala Gly Gly Gly Glu Pro Asn Leu Asp

495 504 513 522 531 540 

GGC GGC GGC GAA GCG CTG CTT GAA GGA GAC CCC AAT CTC TAC CTC ATG GAT TGG 
Gly Gly Gly Glu Ala Leu Val Glu Gly Asp Pro Asn Leu Tyr Leu Met Asp Trp

549 558 567 576 585 594

TCG CCA GCC GAC ACT GTG GGT ATT CTC GAC CAC TGG TTT GGC GTA AAC QGG CTC 
Ser Pro Ala Asp Thr Val Gly Ile Leu Asp His Trp Phe Gly Val Asn Gly Leu

603 612 621 630 639 648

GGC GTG CGG CGT GGC AAA GCC AAA TAC TGG AGT ATG GAT AAC GAG CCC GGC ATC 
Gly Val Arg Arg Gly Lys Ala Lys Tyr Trp Ser Met Asp Asn Glu Pro Gly Ile

657 666 675 684 693 702 

TGG CTT GGC ACC CAC GAC GAT GTA GTG AAA GAA CAA ACC CCG GTA GAA GAT TTC 
Trp Val Gly Thr His Asp Asp Val Val Lys Glu Gln Thr Pro Val Glu Asp Phe

Figure 9
Bankia gouldi endoglucanase (37GP1) (continued)

711 720 729 738 747 756 

CTC CAC ACC TAT TTC GAA ACC GCC AAA AAA GCC CGC GCC AAA TTT CCC GOT ATT 
Leu His Thr Tyr Phe Glu Thr Ala Lys Lys Ala Arg Ala Lys Phe Pro Gly Ile

765 774 783 792 801 810 

AAA ATC ACC CGT CCC CTG CCC CCT AAT GAG TGG CAG TGG TAT GCC TGG GGC GGT 
Lys Ile Thr Gly Pro Val Pro Ala Asn Glu Trp Gln Trp Tyr Ala Trp Gly Gly

819 828 837 846 855 864

TTC TOG GTA CCC CAG GAA CAA GGG TTT ATG AGC TGG ATG GAG TAT TTC ATC AAG 
Phe Ser Val Pro Gln Glu Gln Gly Phe Met Ser Trp Met Glu Tyr Phe Ile Lys

873 882 891 900 909 918 

CGG GTG TCT GAA GAG CAA CGC GCA ACT GGT GTT COC CTC CTC GAT GTA CTC GAT 
Arg Val Ser Glu Glu Gln Arg Ala Ser Gly Val Arg Leu Leu Asp Val Leu Asp

927 936 945 954 963 972 

CTG CAC TAC TAC CCC GGC GCT TAG AAT GCG GAA GAT ATC GTG CAA TTA CAT CGC 
Leu His Tyr Tyr Pro Gly Ala Tyr Asn Ala Glu Asp Ile Val Gln Leu His Arg

981 990 999 1008 1017 1026 

ACG TTC TTC GAC CGC GAC TTT GTT TCA CTG GAT CCC AAC GGG GTG AAA ATG GTA 
Thr Phe Phe Asp Arg Asp Phe Val Ser Leu Asp Ala Asn Gly Val Lys Met Val

1035 1044 1053 1062 1071 1080 

GAA GGT GGC TOO GAT GAC AGC ATC AAC AAG GAA TAT ATT TTC GGC CGA GTG AAC 
Glu Gly Gly Trp Asp Asp Ser Ile Asn Lys Glu Tyr Ile Phe Gly Arg Val Asn

1089 1098 1107 1116 1125 1134 

GAT TGG CTC GAG GAA TAT ATG GGG CCA GAC CAT GGT GTA ACC CTG GGC TTA ACC
Asp Trp Leu Glu Glu Tyr Met Gly Pro Asp His Gly Val Thr Leu Gly Leu Thr

1143 1152 1161 1170 1179 1188 

GAA ATG TGC GTG CGC AAT GTG AAT CCG ATG ACT ACC GCC ATC TGG TAT GCC TCC 
Glu Met Cys Val Arg Asn Val Asn Pro Met Thr Thr Ala Ile Trp Tyr Ala Ser

1197 1206 1215 1224 1233 1242 

ATG CTC GGC ACC TTC GCC GAT AAC CGC CTC GAA ATA TTC ACC CCA TGG TGC TGG 
Met Leu Gly Thr Phe Ala Asp Asn Gly Val Glu Ile Phe Thr Pro Trp Cys Trp

1251 1260 1269 1278 1287 1296 

AAC ACC GGA ATG TGG GAA ACA CTC CAC CTC TTC AGC CCC TAC AAC AAA CCT TAT 
Asn Thr Gly Met Trp Glu Thr Leu His Leu Phe Ser Arg Tyr Asn Lys Pro Tyr

1305 1314 1323 1332 1341 1350 

CGG CTC GCC TCC AGC TCC ACT CTT GAA GAG TTT CTC AGC CCC TAC AGC TCC ATT 
Arg Val Ala Ser Ser Ser Ser Leu Glu Glu Phe Val Ser Ala Tyr Ser Ser Ile

1359 1368 1377 1386 1395 1404 

AAC GAA GCA GAA GAC GCC ATG ACG GTA CTT CTG GTG AAT CGT TCC ACT AGC GAG 
Asn Glu Ala Glu Asp Ala Met Thr Val Leu Leu Val Asn Arg Ser Thr Ser Glu

Figure 9 (Continued) 






Bankia gouldi endoglucanase (37GP1) (continued)

1413 1422 1431 1440 1449 1458

ACC CAC ACC CCC ACT GTC GCT ATC GAC GAT TTC CCA CTG GAT GGC CCC TAC CGC 
S Thr Ala Thr Val Ala Ile Asp Asp Phe Pro Leu Asp Gly Pro Tyr Arg

1467 1476 1485 1494 1503 1512

&rv» r-rc CGC TTA CAC AAC CTG CCG GGC GAG GAA ACC TTC CPA TCT CAC CGA GAC 
Thr Leu Arg Leu His Asn Leu Pro Gly Glu Glu Thr Phe Val Ser His Arg Asp

1521 1530 1539 1548 1557 1566

/LjLn GCC CTO GAA AAA GGT ACA GTG CGC GCC AGC GAC AAT ACG CTA ACA CTC GAC 
Ac* Ala Leu Glu Lys Gly Thr Val Arg Ala Ser Asp Asn Thr Val Thr Leu Glu

1575 1584 1593 1602 1611 

OCC CCT CTG TCC GIT ACT GCA ATA TTG CTC AAG GCC CGO CCC TAA 3 • 
Leu Pro Pro Leu Ser Val Thr Ala Ile Leu Leu Lys Ala Arg Pro
Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence

9 18 27 36 45 54

5 ' GTC ATC TGT GTG OAA ATA 1TC GGA AAC ACC TTC AOA OAT, OCA AGA TTC GTT CTC 

Val Ile Cys Val Glu Ile Phe Gly Lys Thr Phe Arg Glu Gly Arg Phe Val Leu

63 72 81 90 99 108 

AAA GAG AAA AAC TTC ACA GTT GAG TTC GCG GTG GAG AAG ATA CAC CTT GGC TOG 

Lys Glu Lys Asn Phe Thr Val Glu Phe Ala Val Glu Lys Ile His Leu Gly Trp

117 126 135 144 153 162 

AAG ATC TCC GGC AGG GTG AAG GGA ACT CCG GGA AGG CTT GAG OTT CTT GGA ACG 

Lys Ile Ser Gly Arg Val Lys Gly Ser Pro Gly Arg Leu Glu Val Leu Arg Thr

171 180 189 198 207 216 

AAA GCA CCG GAA AAG CTA CTT GTG AAC AAC TOG CAG TCC TOG GGA CCG TOC AGG 

Lys Ala Pro Glu Lys Val Leu Val Asn Asn Trp Gln Ser Trp Gly Pro Cys Arg

225 234 243 252 261 270 

GTG GTC GAT GCC TIT TCT TTC AAA CCA CCT GAA ATA GAT CCG AAC TOG AGA TAC 

Val Val Asp Ala Phe Ser Phe Lys Pro Pro Glu Ile Asp Pro Asn Trp Arg Tyr

279 288 297 306 315 324

ACC GCT TOG GTC GTG CCC GAT CTA CTT GAA AGG AAC CTC CM AGC GAC TAT TTC 

Thr Ala Ser Val Val Pro Asp Val Leu Glu Arg Asn Leu Gln Ser Asp Tyr Phe

333 342 351 360 369 378

CTC GCT GAA GAA GGA AAA GTG TAC GGT TTT CTG ACT TCC AAA ATC GCA CAT CCT 

Val Ala Glu Glu Gly Lys Val Tyr Gly Phe Leu Ser Ser Lys Ile Ala His Pro

387 396 405 414 423 432 

TTC TTC GCT GTG GAA GAT GGG GAA CTT GTG GCA TAC CTC GAA TAT TIC GAT GTC 

Phe Phe Ala Val Glu Asp Gly Glu Leu Val Ala Tyr Leu Glu Tyr Phe Asp Val



441 450 459 468 477 486 

GAG TTC GAC GAC TTT CTT CCT CTT GAA CCT CTC GTT OTA CTC GAG GAT CCC AAC 



Glu Phe Asp Asp Phe Val
Thermotoga maritima Alpha-galactosidase

603 612 621 630 639 648

GAT CTC ACC TOG GAA GAG ACC CTC AAG AAC CTC AAG CTC OCO AAC AAT TTC CCG 



Asp Leu Thr Trp Glu Glu Thr Leu Lys Asn Leu Lys Leu Ala Lys Asn Phe Pro

657 666 675 684 693 702 

TTC GAG CTC TTC GAG ATA GAC GAC GCC TAC GAA AAG CAC ATA GGT GAC TOG CTC 

Phe Glu Val Phe Gln Ile Asp Asp Ala Tyr Glu Lys Asp Ile Gly Asp Trp Leu

711 720 729 738 747 756 

GTO ACA AGA GGA GAC TTT CCA TCG GTG GAA GAG ATO GCA AAA OTP ATA OCG GAA 

Val Thr Arg Gly Asp Phe Pro Ser Val Glu Glu Met Ala Lys Val Ile Ala Glu

765 774 783 792 801 810 

AAC GGT TTC ATC CCD GGC ATA TOG ACC GCC CCG TIC ACT GTT TCT GAA ACC TCG 

Asn Gly Phe Ile Pro Gly Ile Trp Thr Ala Pro Phe Ser Val Ser Glu Thr Ser

819 828 837 846 855 864 

GAT OIK TTC AAC GAA CAT CCG GAC TOG GTA CTC AAG GAA AAC GGA GAG COG AAG 



Asp Val Phe Asn Glu His Pro Asp Trp Val Val Lys Glu Asn Gly Glu Pro Lys 

873 882 891 900 909 918 

ATC GCT TAC AGA AAC TOG AAC AAA AAG ATA T*C GCC CTC GAT CTT TOG AAA GAT 

Met Ala Tyr Arg Asn Trp Asn Lys Lys Ile Tyr Ala Leu Asp Leu Ser Lys Asp

927 936 945 954 963 972 

GAG GTT CTC AAC TCG CTT TIC GAT CTC TIC TCA TCT CIG AGA AAG ATC GGC TAC 

Glu Val Leu Asn Trp Leu Phe Asp Leu Phe Ser Ser Leu Arg Lys Met Gly Tyr

981 990 999 1008 1017 1026 

AGG TAC TTC AAG ATC GAC TTT CTC TTC CCG GGT GCC CTT CCA GGA GAA AGA AAA 

Arg Tyr Phe Lys Ile Asp Phe Leu Phe Ala Gly Ala Val Pro Gly Glu Arg Lys

1035 1044 1053 1062 1071 1080 

AAG AAC ATA ACA CCA ATT CAG GCC TTC AGA AAA GGG ATT GAG ACG ATC AGA AAA 

Lys Asn Ile Thr Pro Ile Gln Ala Phe Arg Lys Gly Ile Glu Thr Ile Arg Lys

1089 1098 1107 1116 1125 1134

GCC GTG GGA GAA GAT TCT TTC ATC CTC GCA TOC CCC TCT CCC CTT CTT CCC GCA 

Ala Val Gly Glu Asp Ser Phe Ile Leu Gly Cys Gly Ser Pro Leu Leu Pro Ala

1143 1152 1161 1170 1179 1188

CTC GGA TOC CTC GAC GOG ATC AGG ATA GGA OCT GAC ACT CCG CCG TTC TOG GGA 

Val Gly Cys Val Asp Gly Met Arg Ile Gly Pro Asp Thr Ala Pro Phe Trp Gly

Figure 10 (Continued)
Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence

1197 1206 1215 1224 1233 1242

GAA CAT ATA GAA GAC AAC GCA OCT CCC CCT OCA ACA TOG GCG CTC AGA AAC GCC 

Glu His Ile Glu Asp Asn Gly Ala Pro Ala Ala Arg Trp Ala Leu Arg Asn Ala

1251 1260 1269 1278 1287 1296

ATA ACG AOS TAC TTC ATC CAC GAC AGC TTC TOG CTC AAC GAC CCC GAC TOT CTC 

Ile Thr Arg Tyr Phe Met His Asp Arg Phe Trp Leu Asn Asp Pro Asp Cys Leu

1305 1314 1323 1332 1341 1350 

ATA CTG AGA GAG GAG AAA ACG GAT CTC ACA CAG AAG GAA AAG GAG CIC TAC TOG 

Ile Leu Arg Glu Glu Lys Thr Asp Leu Thr Gln Lys Glu Lys Glu Leu Tyr Ser

1359 1368 1377 1386 1395 1404 

TAC ACG TGT GGA GIG CTC GAC AAC ATC ATC ATA GAA AGC GAT GAT CIC TOG CTC 



Tyr Thr Cys Gly Val Leu Asp Asn Met Ile Ile Glu Ser Asp Asp Leu Ser Leu

1413 1422 1431 1440 1449 1458

GTC AGA GAT CAT GGA AAA AAG GTT CIC AAA GAA ACG CIC GAA CIC CIC GGT GGA 

Val Arg Asp His Gly Lys Lys Val Leu Lys Glu Thr Leu Glu Leu Leu Gly Gly

1467 1476 1485 1494 1503 1512 

AGA CCA COG GIT CAA AAC ATC ATC TOG GAG GAT CTG AGA TAC GAG ATC GTC TOG 



Arg Pro Arg Val Gln Asn Ile Met Ser Glu Asp Leu Arg Tyr Glu Ile Val Ser

1521 1530 1539 1548 1557 1566 

TCT GGC ACT CTC TCA OCA AAC GTC AAG ATC GTC CTC GAT CTG AAC AGC AGA GAG 

Ser Gly Thr Leu Ser Gly Asn Val Lys Ile Val Val Asp Leu Asn Ser Arg Glu

1575 1584 1593 1602 1611 1620

TAC CAC CTG GAA AAA GAA GGA AAG TCC TCC CTC AAA AAA AGA OTC GTC AAA AGA 

Tyr His Leu Glu Lys Glu Gly Lys Ser Ser Leu Lys Lys Arg Val Val Lys Arg

1629 1638 1647 1656 1665 

GAA GAC GGA AGA AAC TTC TAC TTC TAC GAA GAC OCT GAG AGA GAA TGA 3'

Glu Asp Gly Arg Asn Phe Tyr Phe Tyr Glu Glu Gly Glu Arg Glu 



Figure 10 (Continued)
Thermotoga maritima β-mannanase

81 90 108

... GTT CTC TTT CCA ... GAC GAC ... AAA

... Leu ... Ile Val ... Glu ... Ser Phe ... Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162

GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC

Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216

AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG

Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270

ACT CCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC CAC GGG

Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324 

GAG ACT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC 

Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe

333 342 351 360 369 378 

GGG CTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC 

Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432 

ACA CTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC 

Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486

AAC TGG GAC GAC TTC GGT GGA ATG AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC 

Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540

CAT CAC GAC GAT TTC TAC AGA GAT CAG AAG ATC AAA GAA GAG TAC AAA AAG TAC 

His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11
Thermotoga maritima β-mannanase (6GP2) (continued)



549 558 567 576 585 594 

GTC TCC TTT CTC GTA AAC CAT GTC AAT ACC TAC ACG GGA GTT CCT TAC AGG GAA 

Val Ser Phe Leu Val Asn His Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu

603 612 621 630 639 648 

GAG CCC ACC ATC ATG GCC TOG GAG CTT GCA AAC GAA CCG CCC TCT GAG ACG GAC 

Glu Pro Thr Ile Met Ala Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp

657 666 675 684 693 702 

AAA TCG GGG AAC ACG CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG 

Lys Ser Gly Asn Thr Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys

711 720 729 738 747 756 

AGT CTG GAT CCC AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TIC AGC AAC 

Ser Leu Asp Pro Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn 

765 774 783 792 801 810 

TAC GAA GGA TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG 

Tyr Glu Gly Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp

819 828 837 846 855 864 

TCC GGT GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG 

Ser Gly Val Asp Trp Lys Lys Leu Leu Ser



873 882 891 900 909 918 

TTC CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG 

Phe His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

927 936 945 954 963 972 

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC 

Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys Pro

981 990 999 1008 1017 1026 

GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA ACG GCC 

Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg Thr Ala

1035 1044 1053 1062 1071 1080 

ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT GGA GCG ATG 

Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly Ala Met

Figure 11 (Continued)
Thermotoga maritima β-mannanase (continued)

1089 1098 1107 1116 1125 1134 

TTC TOG ATG CTC GCG GGA ATC GGO GAA GGT TCG GAC AGA GAC GAG AGA GGG TAC 

Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp Glu Arg Gly Tyr

1143 1152 1161 1170 1179 1188 

TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC AGT CCA GAA GCG GAA 

Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp Ser Pro Glu Ala Glu

1197 1206 1215 1224 1233 1242 

CTG ATA AGA GAA TAC GCG AAG CTO TTC AAC ACA GGT GAA GAC ATA AGA GAA GAC 

Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly Glu Asp Ile Arg Glu Asp

1251 1260 1269 1278 1287 1296 

ACC TGC TCT TTC ATC CTT CCA AAA GAC GGC ATG GAG ATC AAA AAG ACC GTG GAA 

Thr Cys Ser Phe Ile Leu Pro Lys Asp Gly Met Glu Ile Lys Lys Thr Val Glu

1305 1314 1323 1332 1341 1350 

GTG AGG GCT GGT CTT TTC GAC TAC AGC AAC ACQ TTT GAA AAG TTG TCT CTC AAA 

Val Arg Ala Gly Val Phe Asp Tyr Ser Asn Thr Phe Glu Lys Leu Ser Val Lys 

1359 1368 1377 1386 1395 1404 

GTC GAA GAT CTG GTT TTT GAA AAT GAG ATA GAG CAT CTC GGA TAC GGA ATP TAC 

Val Glu Asp Leu Val Phe Glu Asn Glu Ile Glu His Leu Gly Tyr Gly Ile Tyr

1413 1422 1431 1440 1449 1458 

GGC TTT GAT CTC GAC ACA ACC CGG ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT 

Gly Phe Asp Leu Asp Thr Thr Arg Ile Pro Asp Gly Glu His Glu Met Phe Leu

1467 1476 1485 1494 1503 1512 

GAA GGC CAC TTT CAG GGA AAA ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG GTG 

Glu Gly His Phe Gln Gly Lys Thr Val Lys Asp Ser Ile Lys Ala Lys Val Val

1521 1530 1539 1548 1557 1566 

AAC GAA GCA CGG TAC GTG CTC GCA GAG GAA CTT GAT TTT TCC TCT CCA GAA GAG 

Asn Glu Ala Arg Tyr Val Leu Ala Glu Glu Val Asp Phe Ser Ser Pro Glu Glu 

1575 1584 1593 1602 1611 1620 

GTG AAA AAC TGG TGG AAC AGC GGA ACC TOG CAG GCA GAG TTC GGG TCA CCT GAC 

Val Lys Asn Trp Trp Asn Ser Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp

Figure 11 (Continued)
Thermotoga maritima β-mannanase (continued)

1629 1638 1647 1656 1665 1674

ATT GAA TGG AAC GGT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTG 

Ile Glu Trp Asn Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu

1683 1692 1701 1710 1719 1728 

CCC GGA AAG AGC GAC TGG GAA GAA GTG AGA GTA GCA AGG AAG TTC GAA AGA CTC 

Pro Gly Lys Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu

1737 1746 1755 1764 1773 1782 

TCA GAA TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC 

Ser Glu Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu

1791 1800 1809 1818 1827 1836 

AAG GGA AGG TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC 

Lys Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

1845 1854 1863 1872 1881 1890 

CTC GAC ATG AAC AAC GCG AAC GTG GAA ACT GCG GAG ATC ATC ACT TTC GGC GGA 

Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly Gly

1899 1908 1917 1926 1935 1944 

AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG GGG GTG 

Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala Gly Val

1953 1962 1971 1980 1989 1998 

AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AGG TAC GAT GGA CCG ATT 

Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp Gly Pro Ile

2007 2016 2025 2034 2043 

TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG TGA 3'

Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met ***



Figure 11 (Continued)
AEPII 1a β-mannosidase (63GB1)

9 18 27 36 45 54 

5 ' ATG CTA CCA GAA GAG TTC CTA TGG CCC GTT CGG CAG TCA GGC TTT CAG TTC GAA 

Met Leu Pro Glu Glu Phe Leu Trp Gly Val Gly Gln Ser Gly Phe Gln Phe Glu

63 72 81 90 99 108 

ATG GGC GAC AAG CTC AGG AGG CAC ATC GAT CCA AAT ACC GAC TGG TGG AAG TGG 

Met Gly Asp Lys Leu Arg Arg His Ile Asp Pro Asn Thr Asp Trp Trp Lys Trp

117 126 135 144 153 162 

GTT CGC GAT CCT TTC AAC ATA AAA AAG GAG CTT GTG ACT GOG GAC CTT CCC GAG 

Val Arg Asp Pro Phe Asn Ile Lys Lys Glu Leu Val Ser Gly Asp Leu Pro Glu

171 180 189 198 207 216 

GAC GGC ATC AAC AAC TAC GAA CTT TTT GAA AAC GAT CAC AAG CTC GCT AAA GGC 

Asp Gly Ile Asn Asn Tyr Glu Leu Phe Glu Asn Asp His Lys Leu Ala Lys Gly

225 234 243 252 261 270 

CTT GGA CTC AAC GCA TAC AGG ATT GGA ATA GAG TGG AGC AGA ATC TTT CCC TGG 

Leu Gly Leu Asn Ala Tyr Arg Ile Gly Ile Glu Trp Ser Arg Ile Phe Pro Trp

279 288 297 306 315 324 

CCG ACQ TGG ACG GTC GAT ACC GAG GTC GAG TTC GAC ACT TAC GCT TTA GTA AAC 

Pro Thr Trp Thr Val Asp Thr Glu Val Glu Phe Asp Thr Tyr Gly Leu Val Lys 

333 342 351 360 369 378 

GAC GTT AAG ATA GAC AAG TCC ACC CTT GCT GAA CTC GAC AGG CTG GCC AAC AAG 

Asp Val Lys Ile Asp Lys Ser Thr Leu Ala Glu Leu Asp Arg Leu Ala Asn Lys

387 396 405 414 423 432 

GAG GAG GTA ATG TAC TAC AGG CGC GTT ATT CAG CAT TTG AGG GAG CTC GGC TTC 

Glu Glu Val Met Tyr Tyr Arg Arg Val Ile Gln His Leu Arg Glu Leu Gly Phe

441 450 459 468 477 486 

AAG GTC TTC GTT AAC CTC AAC CAC TTC ACQ CTT CCA ATA TGG CTC CAC GAC CCG 

Lys Val Phe Val Asn Leu Asn His Phe Thr Leu Pro Ile Trp Leu His Asp Pro

495 504 513 522 531 540 

ATA GTG GCA AGG GAG AAG GCC CTC ACA AAC GAC AGA ATC GGC TGG GTC TCC CAG 

Ile Val Ala Arg Glu Lys Ala Leu Thr Asn Asp Arg Ile Gly Trp Val Ser Gln

Figure 12
AEPII 1a β-mannosidase (63GB1) (continued)

549 558 567 576 585 594

AGO ACA CTT GTT GAG TTT GCC AAG TAT GCT GCT TAC ATC GCC CAT GCG CTC GGA 

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

603 612 621 630 639 648

GAC CTC GTG GAC ACA TGG AGC ACC TTC AAC GAA CCT ATG GTA GTT GTO GAG CTC 



Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu

657 666 675 684 693 702

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCG GGA GTC ATG AAC CCC GAG GCC 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala

711 720 729 738 747 756 

GCG AAG CTG GCO ATC CTC AAC ATG ATA AAC GCC CAC GCC TTG GGA TAT AAG ATG 

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 



Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp



819 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC GGT GTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 918 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC ACC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

927 936 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC GAC GGC GAA AAC TTT 

Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

981 990 999 1008 1017 1026

GTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080

CGC GAG GTT GTT AGA TAT TCG GAG CCC AAG TTC CCA ACT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12 (Continued)
AEPII 1a β-mannosidase (63GB1) (continued)

1089 1098 1107 1116 1125 1134 

TTC AAG GGC CTT CCC AAC TAC GGC TAC TCC TGC AGG CCC GGC ACG ACC TCC CCC

Phe Lys Gly Val Pro Asn Tyr Gly Tyr Ser Cys Arg Pro Gly Thr Thr Ser Ala

1143 1152 1161 1170 1179 1188

GAT GGC ATG CCC GTC AGC GAT ATC GGC TGG GAA GTC TAT CCC CAG GGA ATC TAC

Asp Gly Met Pro Val Ser Asp Ile Gly Trp Glu Val Tyr Pro Gln Gly Ile Tyr

1197 1206 1215 1224 1233 1242 

GAC TCG ATA GTC GAG GCC ACC AAG TAC ACT GTT CCT GTT TAC GTC ACC GAG AAC 

Asp Ser Ile Val Glu Ala Thr Lys Tyr Ser Val Pro Val Tyr Val Thr Glu Asn

1251 1260 1269 1278 1287 1296 

OCT GTT GCG GAT TCC GCG GAC ACG CTG AGG CCA TAC TAC ATA GTC AGC CAC GTC 

Gly Val Ala Asp Ser Ala Asp Thr Leu Arg Pro Tyr Tyr Ile Val Ser His Val

1305 1314 1323 1332 1341 1350

TCA AAG ATA GAG GAA GCC ATT GAG AAT GGA TAC CCC GTA AAA GGC TAC ATG TAC

Ser Lys Ile Glu Glu Ala Ile Glu Asn Gly Tyr Pro Val Lys Gly Tyr Met Tyr

1359 1368 1377 1386 1395 1404 

TGG GCG CTT ACG GAT AAC TAC GAG TGG GCC CTC GGC TTC AGC ATG AGG TTT GGT 

Trp Ala Leu Thr Asp Asn Tyr Glu Trp Ala Leu Gly Phe Ser Met Arg Phe Gly 

1413 1422 1431 1440 1449 1458 

CTC TAC AAG GTC GAC CTC ATC TCC AAG GAG AGG ATC CCG AGG GAG AGA AGC GTT 

Leu Tyr Lys Val Asp Leu Ile Ser Lys Glu Arg Ile Pro Arg Glu Arg Ser Val

1467 1476 1485 1494 1503 1512

GAG ATA TAT CGC AGG ATA CTG CAG TCC AAC GGT GTT CCT AAG GAT ATC AAA GAG

Glu Ile Tyr Arg Arg Ile Val Gln Ser Asn Gly Val Pro Lys Asp Ile Lys Glu

1521 1530 1539

GAG TTC CTG AAG GGT GAG GAG AAA TCA 3'

Glu Phe Leu Lys Gly Glu Glu Lys ••• 



Figure 12 (Continued) 
OC1/4V Endoglucanase (33GP1)

9 18 27 36 45 54

ATG GTA GAA AGA CAC TTC AGA TAT GTT CTT ATT TGC ACC era TTT CTT GTT ATG

Met Val Glu Arg His Phe Arg Tyr Val Leu Ile Cys Thr Leu Phe Leu Val Met

63 72 81 90 99 108

CTC CTA ATC TCA TCC ACT CAG TGT GGA AAA AAT GAA CCA AAC AAA AGA GTG AAT

Leu Leu Ile Ser Ser Thr Gln Cys Gly Lys Asn Glu Pro Asn Lys Arg Val Asn

117 126 135 144 153 162

AGC ATG GAA CAG TCA GTT GCT GAA AGT GAT AGC AAC TCA GCA TTT GAA TAC AAC

Ser Met Glu Gln Ser Val Ala Glu Ser Asp Ser Asn Ser Ala Phe Glu Tyr Asn

171 180 189 198 207 216

AAA ATG GTA GGT AAA GGA GTA AAT ATT GGA AAT GCT TTA GAA GCT CCT TTC GAA

Lys Met Val Gly Lys Gly Val Asn Ile Gly Asn Ala Leu Glu Ala Pro Phe Glu

225 234 243 252 261 270

GGA GCT TGG GGA GTA AGA ATP GAG GAT GAA TAT TTT GAG ATA ATA AAG AAA AGG

Gly Ala Trp Gly Val Arg Ile Glu Asp Glu Tyr Phe Glu Ile Ile Lys Lys Arg

279 288 297 306 315 324

GGA TTT GAT TCT GTT AGG ATT CCC ATA AGA TGG TCA GCA CAT ATA TCC GAA AAG

Gly Phe Asp Ser Val Arg Ile Pro Ile Arg Trp Ser Ala His Ile Ser Glu Lys

333 342 351 360 369 378

CCA CCA TAT GAT ATT GAC AGG AAT TTC CTC GAA AGA GTT AAC CAT GTT GTC GAT

Pro Pro Tyr Asp Ile Asp Arg Asn Phe Leu Glu Arg Val Asn His Val Val Asp

387 396 405 414 423 432

AGG GCT CTT GAG AAT AAT TTA ACA GTA ATC ATC AAT ACG CAC CAT TTT GAA GAA

Arg Ala Leu Glu Asn Asn Leu Thr Val Ile Ile Asn Thr His His Phe Glu Glu

441 450 459 468 477 486

CTC TAT CAA GAA CCG GAT AAA TAC GGC GAT GTT TTG GTG GAA ATT TGG AGA CAG

Leu Tyr Gln Glu Pro Asp Lys Tyr Gly Asp Val Leu Val Glu Ile Trp Arg Gln

495 504 513 522 531 540

ATT GCA AAA TTC TTT AAA GAT TAC CCG GAA AAT CTG TTC TTT GAA ATC TAC AAC
OC1/4V Endoglucanase (33GP1) (continued)

549 558 567 576 585 594

GAG CCT GCT CAG AAC TTG ACA GCT GAA AAA TGG AAC GCA CTT TAT CCA AAA GTC

Glu Pro Ala Gln Asn Leu Thr Ala Glu Lys Trp Asn Ala Leu Tyr Pro Lys Val



603 612 621 630 639 648 

CTC AAA GTT ATC AGG GAG AGC AAT CCA ACC CGG ATT GTC ATT ATC GAT GCT CCA 

Leu Lys Val Ile Arg Glu Ser Asn Pro Thr Arg Ile Val Ile Ile Asp Ala Pro

657 666 675 684 693 702 

AAC TGG CCA CAC TAT AGC GCA GTG AGA ACT CTA AAA TTA GTC AAC GAC AAA CGC 

Asn Trp Ala His Tyr Ser Ala Val Arg Ser Leu Lys Leu Val Asn Asp Lys Arg 

711 720 729 738 747 756

ATC ATT GTT TCC TTC CAT TAC TAC GAA CCT TTC AAA TTC ACA CAT CAG GGT G^

Ile Ile Val Ser Phe His Tyr Tyr Glu Pro Phe Lys Phe Thr His Gln Gly Ala

765 774 783 792 801 810 

GAATCCOTWTCCCATCCCACacnTAGGGTTAAGTOTmGGC^GAATPG 

Glu Trp Val Asn Pro Ile Pro Pro Val Arg Val Lys Trp Asn Gly Glu Glu Trp

819 828 837 846 855 864 

GAA ATT AAC CAA ATC AGA ACT CAT TTC AAA TAC GTG AGT GAC TGG GCA AAG CAA

Glu Ile Asn Gln Ile Arg Ser His Phe Lys Tyr Val Ser Asp Trp Ala Lys Gln

873 882 891 900 909 918

AAT AAC GTA CCA ATC TTT CTT GGT GAA TTC GGT GCT TAT TCA AAA GCA GAC ATG

Asn Asn Val Pro Ile Phe Leu Gly Glu Phe Gly Ala Tyr Ser Lys Ala Asp Met



927 936 945 954 963 972 

GAC TCA AGG GTT AAG TGG ACC GAA AGT GTG AGA AAA ATG GCG GAA GAA TTT GG * 

Asp Ser Arg Val Lys Trp Thr Glu Ser Val Arg Lys Met Ala Glu Glu Phe Gly

981 990 999 1008 1017 1026

TTT TCA TAC GCG TAT TGG GAA TTT TGT GCA GGA TTT GGC ATA TAC GAT AGA TGG

Phe Ser Tyr Ala Tyr Trp Glu Phe Cys Ala Gly Phe Gly Ile Tyr Asp Arg Trp

1035 1044 1053 1062 1071 1080

TCT CAA AAC TGG ATC GAA CCA TTG GCA ACA GCT GTG GTT GGC ACA GGC AAA

Ser Gln Asn Trp Ile Glu Pro Leu Ala Thr Ala Val Val Gly Thr Gly Lys Glu

TAA 3'

•••



Figure 13 (Continued) 
Thermotoga maritima Pullulanase (6GP3)

9 18 27 36 45 54 

5' ATG GAT CTT ACA AAG GTG GGG ATC ATA GTG AGG CTG AAC GAG TGG CAG GCA AAA 

Met Asp Leu Thr Lys Val Gly Ile Ile Val Arg Leu Asn Glu Trp Gln Ala Lys

63 72 81 90 99 108

GAC GTG GCA AAA GAC AGG TTC ATA GAG ATA AAA GAC GGA AAG GCT GAA GTG TGG

Asp Val Ala Lys Asp Arg Phe Ile Glu Ile Lys Asp Gly Lys Ala Glu Val Trp

117 126 135 144 153 162

ATA CTC CAG GGA GTG GAA GAG ATT TTC TAC GAA AAA CCA GAC ACA TCT CCC AGA

Ile Leu Gln Gly Val Glu Glu Ile Phe Tyr Glu Lys Pro Asp Thr Ser Pro Arg

171 180 189 198 207 216

ATC TTC TTC GCA CAG GCA AGG TCG AAC AAG GTG ATC GAG GCT TTT CTG ACC AAT

Ile Phe Phe Ala Gln Ala Arg Ser Asn Lys Val Ile Glu Ala Phe Leu Thr Asn

225 234 243 252 261 270 

CCT GTG GAT ACG AAA AAG AAA GAA CTC TTC AAG GTT ACT GTT GAC GGA AAA GAG 

Pro Val Asp Thr Lys Lys Lys Glu Leu Phe Lys Val Thr Val Asp Gly Lys Glu 

279 288 297 306 315 324 

ATT CCC GTC TCA AGA GTG GAA AAG GCC GAT CCC ACG GAC ATA GAC GTG ACG AAC 

Ile Pro Val Ser Arg Val Glu Lys Ala Asp Pro Thr Asp Ile Asp Val Thr Asn

333 342 351 360 369 378

TAC GTG AGA ATC GTC CTT TCT GAA TCC CTG AAA GAA GAA GAC CTC AGA AAA GAC

Tyr Val Arg Ile Val Leu Ser Glu Ser Leu Lys Glu Glu Asp Leu Arg Lys Asp

387 396 405 414 423 432

GTG GAA CTG ATC ATA GAA GGT TAC AAA CCG GCA AGA GTC ATC ATG ATG GAG ATC

Val Glu Leu Ile Ile Glu Gly Tyr Lys Pro Ala Arg Val Ile Met Met Glu Ile

441 450 459 468 477 486 

CTG GAC GAC TAC TAT TAC GAT GGA GAG CTC GGA GCC GTA TAT TCT CCA GAG AAG 

Leu Asp Asp Tyr Tyr Tyr Asp Gly Glu Leu Gly Ala Val Tyr Ser Pro Glu Lys 

495 504 513 522 531 540 

ACG ATA TTC AGA CTC TGG TCC CCC GTT TCT AAG TGG GTA AAG GTG CTT CTC TTC 

Thr Ile Phe Arg Val Trp Ser Pro Val Ser Lys Trp Val Lys Val Leu Leu Phe

Figure 14 
Thermotoga maritima Pullulanase (6GP3) (continued)

549 558 567 576 585 594 

AAA AAC GGA GAA GAC ACA GAA CCG TAC CAG GTT CTG AAC ATG GAA TAC AAG GGA 

Lys Asn Gly Glu Asp Thr Glu Pro Tyr Gln Val Val Asn Met Glu Tyr Lys Gly

603 612 621 630 639 648

AAC GGG GTC TGG GAA GCG GTT GTT GAA GGC GAT CTC GAC GGA GTG TTC TAC CTC

Asn Gly Val Trp Glu Ala Val Val Glu Gly Asp Leu Asp Gly Val Phe Tyr Leu

657 666 675 684 693 702 

TAT CAG CTG GAA AAC TAC GGA AAG ATC AGA ACA ACC GTC GAT CCT TAT TCG AAA 

Tyr Gln Leu Glu Asn Tyr Gly Lys Ile Arg Thr Thr Val Asp Pro Tyr Ser Lys

711 720 729 738 747 756

CCG GTT TAC GCA AAC AAC CAA GAG AGC GCC GTT CTG AAT CTT GCC AGG ACA AAC

Ala Val Tyr Ala Asn Asn Gln Glu Ser Ala Val Val Asn Leu Ala Arg Thr Asn

765 774 783 792 801 810 

CCA GAA GGA TGG GAA AAC GAC AGG GGA CCG AAA ATC GAA GGA TAC GAA GAC GCG 

Pro Glu Gly Trp Glu Asn Asp Arg Gly Pro Lys Ile Glu Gly Tyr Glu Asp Ala

819 828 837 846 855 864

ATA ATC TAT GAA ATA CAC ATA GCC GAC ATC ACA GGA CTC GAA AAC TCC GGG GTA

Ile Ile Tyr Glu Ile His Ile Ala Asp Ile Thr Gly Leu Glu Asn Ser Gly Val

873 882 891 900 909 918 

AAA AAC AAA GGC CTC TAT CTC GGG CTC ACC GAA GAA AAC ACG AAA GGA CCG GGC 

Lys Asn Lys Gly Leu Tyr Leu Gly Leu Thr Glu Glu Asn Thr Lys Gly Pro Gly 

927 936 945 954 963 972 

GGT GTG ACA ACA GGC CTT TCG CAC CTT GTG GAA CTC GGT GTT ACA CAC GTT CAT 

Gly Val Thr Thr Gly Leu Ser His Leu Val Glu Leu Gly Val Thr His Val His 

981 990 999 1008 1017 1026 

ATA CTT CCT TTC TTT GAT TTC TAC ACA GGC GAC GAA CTC GAT AAA GAT TTC GAG 

Ile Leu Pro Phe Phe Asp Phe Tyr Thr Gly Asp Glu Leu Asp Lys Asp Phe Glu

1035 1044 1053 1062 1071 1080 

AAG TAC TAC AAC TGG GGT TAC GAT CCT TAC CTG TTC ATG GTT CCG GAG GGC AGA

Lys Tyr Tyr Asn Trp Gly Tyr Asp Pro Tyr Leu Phe Met Val Pro Glu Gly Arg 

Figure 14 (Continued) 
Thermotoga maritima Pullulanase (6GP3) (continued)

1089 1098 1107 1116 1125 1134 

TAC TCA ACC GAT CCC AAA AAC CCA CAC ACG AGA ATC AGA GAA GTC AAA GAA ATG 

Tyr Ser Thr Asp Pro Lys Asn Pro His Thr Arg Ile Arg Glu Val Lys Glu Met

1143 1152 1161 1170 1179 1188

GTC AAA GCC CTT CAC AAA CAC GGT ATA GGT GTG ATT ATG GAC ATG GTG TTC CCT

Val Lys Ala Leu His Lys His Gly Ile Gly Val Ile Met Asp Met Val Phe Pro

1197 1206 1215 1224 1233 1242 

CAC ACC TAC GGT ATA GGC GAA CTC TCT GCG TTC GAT CAC ACG GTG CCG TAC TAC 

His Thr Tyr Gly Ile Gly Glu Leu Ser Ala Phe Asp Gln Thr Val Pro Tyr Tyr

1251 1260 1269 1278 1287 1296

TTC TAC AGA ATC GAC AAG ACA GGT GCC TAT TTG AAC GAA AGC GGA TGT GGT AAC

Phe Tyr Arg Ile Asp Lys Thr Gly Ala Tyr Leu Asn Glu Ser Gly Cys Gly Asn

1305 1314 1323 1332 1341 1350 

GTC ATC GCA AGC GAA AGA CCC ATG ATG AGA AAA TTC ATA GTC GAT ACC GTC ACC 

Val Ile Ala Ser Glu Arg Pro Met Met Arg Lys Phe Ile Val Asp Thr Val Thr

1359 1368 1377 1386 1395 1404

TAC TGG GTA AAG GAG TAT CAC ATA GAC GGA TTC AGG TTC GAT CAG ATG GGT CTC

Tyr Trp Val Lys Glu Tyr His Ile Asp Gly Phe Arg Phe Asp Gln Met Gly Leu

1413 1422 1431 1440 1449 1458 

ATC GAC AAA AAG ACA ATG CTC GAA GTC GAA AGA GCT CTT CAT AAA ATC GAT CCA 

Ile Asp Lys Lys Thr Met Leu Glu Val Glu Arg Ala Leu His Lys Ile Asp Pro

1467 1476 1485 1494 1503 1512

ACT ATC ATT CTC TAC GGC GAA CCG TGG GGT GGA TGG GGA GCA CCG ATC AGG TTT

Thr Ile Ile Leu Tyr Gly Glu Pro Trp Gly Gly Trp Gly Ala Pro Ile Arg Phe

1521 1530 1539 1548 1557 1566 

GGA AAG AGC GAT GTC GCC GGC ACA CAC GTG GCA GCT TTC AAC GAT GAG TTC AGA 

Gly Lys Ser Asp Val Ala Gly Thr His Val Ala Ala Phe Asn Asp Glu Phe Arg 

1575 1584 1593 1602 1611 1620 

GAC GCA ATA AGG GGT TCC GTG TTC AAC CCG AGC GTC AAG GGA TTC GTC ATG GGA 

Asp Ala Ile Arg Gly Ser Val Phe Asn Pro Ser Val Lys Gly Phe Val Met Gly

Figure 14 (Continued) 
Thermotoga maritima Pullulanase (6GP3) (continued)

1629 1638 1647 1656 1665 1674

GGA TAC GGA AAG GAA ACC AAG ATC AAA AGG GGT GTT GTT GGA AGC ATA AAC TAC

Gly Tyr Gly Lys Glu Thr Lys Ile Lys Arg Gly Val Val Gly Ser Ile Asn Tyr

1683 1692 1701 1710 1719 1728 

GAC GGA AAA CTC ATC AAA ACT TTC GCC CTT GAT CCA GAA GAA ACT ATA AAC TAC 

Asp Gly Lys Leu Ile Lys Ser Phe Ala Leu Asp Pro Glu Glu Thr Ile Asn Tyr

1737 1746 1755 1764 1773 1782

GCA GCG TGT CAC GAC AAC CAC ACA CTG TGG GAC AAG AAC TAC CTT GCC GCC AAA

Ala Ala Cys His Asp Asn His Thr Leu Trp Asp Lys Asn Tyr Leu Ala Ala Lys

1791 1800 1809 1818 1827 1836 

GCT GAT AAG AAA AAG GAA TGG ACC GAA GAA GAA CTG AAA AAC GCC CAG AAA CTG 

Ala Asp Lys Lys Lys Glu Trp Thr Glu Glu Glu Leu Lys Asn Ala Gln Lys Leu

1845 1854 1863 1872 1881 1890

GCT GGT GCG ATA CTT CTC ACT TCT CAA GGT GTT CCT TTC CTC CAC GGA GGG CAG

Ala Gly Ala Ile Leu Leu Thr Ser Gln Gly Val Pro Phe Leu His Gly Gly Gln

1899 1908 1917 1926 1935 1944 

GAC TTC TGC AGG ACG ACG AAT TTC AAC GAC AAC TCC TAC AAC GCC CCT ATC TCG 

Asp Phe Cys Arg Thr Thr Asn Phe Asn Asp Asn Ser Tyr Asn Ala Pro Ile Ser

1953 1962 1971 1980 1989 1998

ATA AAC GGC TTC GAT TAC GAA AGA AAA CTT CAG TTC ATA GAC GTG TTC AAT TAC

Ile Asn Gly Phe Asp Tyr Glu Arg Lys Leu Gln Phe Ile Asp Val Phe Asn Tyr

2007 2016 2025 2034 2043 2052 

CAC AAG GGT CTC ATA AAA CTC AGA AAA GAA CAC CCT GCT TTC AGG CTG AAA AAC 

His Lys Gly Leu Ile Lys Leu Arg Lys Glu His Pro Ala Phe Arg Leu Lys Asn

2061 2070 2079 2088 2097 2106

GCT GAA GAG ATC AAA AAA CAC CTG GAA TTT CTC CCG GGC GGC AGA AGA ATA GTT

Ala Glu Glu Ile Lys Lys His Leu Glu Phe Leu Pro Gly Gly Arg Arg Ile Val

2115 2124 2133 2142 2151 2160

GCG TTC ATG CTT AAA GAC CAC GCA GGT GGT GAT CCC TGG AAA GAC ATC GTG GTG

Ala Phe Met Leu Lys Asp His Ala Gly Gly Asp Pro Trp Lys Asp Ile Val Val

Figure 14 (Continued) 
Thermotoga maritima Pullulanase (6GP3) (continued)

2169 2178 2187 2196 2205 2214

ATT TAC AAT GGA AAC TTA GAG AAG ACA ACA TAC AAA CTG CCA GAA GGA AAA TGG

Ile Tyr Asn Gly Asn Leu Glu Lys Thr Thr Tyr Lys Leu Pro Glu Gly Lys Trp

2223 2232 2241 2250 2259 2268 

AAT GTG GTT GTG AAC AGC CAG AAA GCC GGA ACA GAA GTG ATA GAA ACC GTC GAA 

Asn Val Val Val Asn Ser Gln Lys Ala Gly Thr Glu Val Ile Glu Thr Val Glu

2277 2286 2295 2304 2313

GGA ACA ATA GAA CTC GAT CCG CTT TCC GCG TAC GTT CTG TAC AGA GAG TGA 3'

Gly Thr Ile Glu Leu Asp Pro Leu Ser Ala Tyr Val Leu Tyr Arg Glu ***



Figure 14 (Continued) 